Abstract. We prove that the Sherrington-Kirkpatrick model of spin 
glasses is chaotic under small perturbations of the couplings at any tem- 
perature in the absence of an external field. The result is proved for 
two kinds of perturbations: (a) distorting the couplings via Ornstein- 
Uhlenbeck flows, and (b) replacing a small fraction of the couplings by 
independent copies. We further prove that the S-K model exhibits mul- 
tiple valleys in its energy landscape, in the weak sense that there are 
many states with near-minimal energy that are mutually nearly orthog- 
onal. We show that the variance of the free energy of the S-K model is 
unusually small at any temperature. (By 'unusually small' we mean that 
it is much smaller than the number of sites; in other words, it beats the 
classical Gaussian concentration inequality, a phenomenon that we call 
'superconcentration'.) We prove that the bond overlap in the Edwards- 
Anderson model of spin glasses is not chaotic under perturbations of 
the couplings, even large perturbations. Lastly, we obtain sharp lower 
bounds on the variance of the free energy in the E-A model on any 
bounded degree graph, generalizing a result of Wehr and Aizenman and 
establishing the absence of superconcentration in this class of models. 
Our techniques apply for the p-spin models and the Random Field Ising 
Model as well, although we do not work out the details in these cases. 
1. Introduction 

Spin glasses are magnetic materials with strange properties that distin- 
guish them from ordinary ferromagnets. In statistical physics, the study of 
spin glasses originated with the works of Edwards and Anderson [11] and 
Sherrington and Kirkpatrick [33] in 1975. In the following decade, the
theoretical study of spin glasses led to the invention of deep and powerful new
methods in physics, most notably Parisi's broken replica method. We refer 
to [26] for a survey of the physics literature. 

However, these physical breakthroughs were far beyond the reach of
rigorous proof at the time, and many of them remain so to date. The rigorous
analysis of the Sherrington-Kirkpatrick model began with the works of
Aizenman, Lebowitz and Ruelle [1] and Fröhlich and Zegarliński [15] in the
late eighties; the field remained stagnant for a while, interspersed with a few
nice papers occasionally (e.g. [8], [32]). The deepest mysteries of the broken
replica analysis of the S-K model remained mathematically intractable for
many more years until the path-breaking contributions of Guerra, Toninelli,
Talagrand, Panchenko and others in the last ten years (see e.g. [2], [19], [18],
[30], [17], [34], [35]). Arguably the most notable achievement in this period
was Talagrand's proof of the Parisi formula [35].
However, in spite of all this remarkable progress, our understanding of 
these complicated mathematical objects is still shrouded in mystery, and 
many conjectures remain unresolved. In this article we attempt to give a 
mathematical foundation to some aspects of spin glasses that have been well- 
known in the physics community for a long time but never before penetrated 
by rigorous mathematics. Let us now embark on a description of our main 
results. Further references and connections with the literature will be given 
at the appropriate places along the way. 

1.1. Weak multiple valleys in the S-K model. Consider the following
simple-looking probabilistic question: Suppose $(g_{ij})_{1 \le i,j \le N}$ are i.i.d.
standard Gaussian random variables, and we define, for each $\sigma \in \{-1,1\}^N$, the
quantity

(1)    $X_N(\sigma) := \sum_{1 \le i,j \le N} g_{ij} \sigma_i \sigma_j.$

Then is it true that with high probability, there is a large subset $A$ of
$\{-1,1\}^N$ such that

(2)    $X_N(\sigma) \approx \max_{\sigma' \in \{-1,1\}^N} X_N(\sigma')$ for each $\sigma \in A$,

and any two distinct elements $\sigma^1, \sigma^2$ of $A$ are nearly orthogonal, in the sense
that

(3)    $R_{\sigma^1,\sigma^2} = R_{1,2} := \frac{1}{N} \sum_{i=1}^N \sigma^1_i \sigma^2_i \approx 0$?

(In the spin glass literature, the quantity $R_{1,2}$ is called the 'overlap' between
the 'configurations' $\sigma^1$ and $\sigma^2$.) To realize the non-triviality of the question,
consider a slightly different Gaussian field $Y_N$ on $\{-1,1\}^N$, defined as

    $Y_N(\sigma) := \sum_{i=1}^N g_i \sigma_i,$

where $g_1, \ldots, g_N$ are i.i.d. standard Gaussian random variables. Then clearly,
$Y_N$ is maximized at $\hat\sigma$, where $\hat\sigma_i = \mathrm{sign}(g_i)$. Note that for any $\sigma$,

    $Y_N(\sigma) = \sum_{i : \sigma_i = \hat\sigma_i} |g_i| - \sum_{i : \sigma_i = -\hat\sigma_i} |g_i|.$

It is not difficult to argue from here that if $\sigma$ is another configuration that
is near-maximal for $Y_N$, then $\sigma$ must agree with $\hat\sigma$ at nearly all coordinates.
Thus, the field $Y_N$ does not satisfy the 'multiple peaks picture' that we are
investigating about $X_N$. This is true in spite of the fact that $Y_N(\sigma)$ and
$Y_N(\sigma')$ are approximately independent for almost all pairs $(\sigma, \sigma')$.
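The contrast above can be checked by brute force for small N. The following is a minimal sketch and not part of the original argument: it verifies numerically that the field Y_N is maximized at the coordinatewise sign vector and that the decomposition into agreeing and disagreeing coordinates holds for every configuration. The names `y_field`, `sigma_hat` and `decomposition`, the seed, and the choice N = 8 are illustrative assumptions.

```python
import itertools
import random


def y_field(g, sigma):
    """Y_N(sigma) = sum_i g_i * sigma_i."""
    return sum(gi * si for gi, si in zip(g, sigma))


def sign(x):
    """Sign with the convention sign(0) = +1 (ties have probability zero)."""
    return 1 if x >= 0 else -1


def decomposition(g, sigma, sigma_hat):
    """Sum of |g_i| over agreeing coordinates minus the sum over disagreeing ones."""
    agree = sum(abs(gi) for gi, si, hi in zip(g, sigma, sigma_hat) if si == hi)
    disagree = sum(abs(gi) for gi, si, hi in zip(g, sigma, sigma_hat) if si != hi)
    return agree - disagree


rng = random.Random(0)  # fixed seed, used only for reproducibility
N = 8
g = [rng.gauss(0.0, 1.0) for _ in range(N)]
sigma_hat = tuple(sign(gi) for gi in g)

# Brute-force maximizer over all 2^N configurations.
best = max(itertools.product((-1, 1), repeat=N), key=lambda s: y_field(g, s))
```

Flipping any coordinate of `sigma_hat` moves the corresponding |g_i| from the first sum to the second, which is why near-maximizers of Y_N must agree with it at nearly all coordinates.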

We have the following result about the existence of multiple peaks in the
field $X_N$. It says that with high probability, there is a large collection $A$ of
configurations satisfying (2) and (3), that is, $R_{\sigma^1,\sigma^2} \approx 0$ for any two distinct
$\sigma^1, \sigma^2 \in A$, and $X_N(\sigma) \approx \max_{\sigma'} X_N(\sigma')$ for each $\sigma \in A$.
Theorem 1.1. Let $X_N$ be the field defined in (1), and define the overlap
$R_{\sigma^1,\sigma^2}$ between configurations $\sigma^1, \sigma^2$ by the formula (3). Let

    $M_N := \max_{\sigma} X_N(\sigma).$

Then there are constants $r_N \to \infty$, $\gamma_N \to 0$, $\epsilon_N \to 0$, and $\delta_N \to 0$ such that
with probability at least $1 - \gamma_N$, there is a set $A \subseteq \{-1,1\}^N$ satisfying

(a) $|A| \ge r_N$,

(b) $R^2_{\sigma^1,\sigma^2} \le \epsilon_N$ for all $\sigma^1, \sigma^2 \in A$, $\sigma^1 \ne \sigma^2$, and

(c) $X_N(\sigma) \ge (1 - \delta_N) M_N$ for all $\sigma \in A$.

Quantitatively, we can take $r_N = (\log N)^{1/8}$, $\delta_N = (\log N)^{-1/8}$,
$\epsilon_N = e^{-(\log N)^{1/8}}$ and $\gamma_N = C(\log N)^{-1/12}$, where $C$ is an absolute constant.
However, these are not necessarily the best choices.

Let us now discuss the implication of this result in spin glass theory. The 
Sherrington-Kirkpatrick model of spin glasses, introduced in [33], is defined 
through the Hamiltonian (i.e. energy function)

(4)    $H_N(\sigma) := -\frac{1}{\sqrt{N}} X_N(\sigma) = -\frac{1}{\sqrt{N}} \sum_{1 \le i,j \le N} g_{ij} \sigma_i \sigma_j.$

The S-K model at inverse temperature $\beta \ge 0$ defines a probability measure
$G_N$ on $\{-1,1\}^N$ through the formula

(5)    $G_N(\{\sigma\}) := Z(\beta)^{-1} e^{-\beta H_N(\sigma)},$

where $Z(\beta)$ is the normalizing constant. The measure $G_N$ is called the Gibbs
measure.
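For very small N the Gibbs measure can be written out explicitly by enumerating all configurations. The sketch below is only an illustration under stated assumptions: it takes the Hamiltonian to be H_N = -X_N / sqrt(N) as in (4), and the function names, the seed, N = 6 and beta = 2.0 are hypothetical choices made for the example.

```python
import itertools
import math
import random


def x_field(g, sigma):
    """X_N(sigma) = sum over all ordered pairs (i, j) of g_ij * sigma_i * sigma_j."""
    n = len(sigma)
    return sum(g[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))


def hamiltonian(g, sigma):
    """H_N(sigma) = -X_N(sigma) / sqrt(N), the normalization assumed here."""
    return -x_field(g, sigma) / math.sqrt(len(sigma))


def gibbs_measure(g, beta):
    """Return a dict mapping each configuration to its Gibbs weight."""
    n = len(g)
    configs = list(itertools.product((-1, 1), repeat=n))
    energies = [hamiltonian(g, s) for s in configs]
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return {s: w / z for s, w in zip(configs, weights)}


def overlap(s1, s2):
    """R_{1,2} = (1/N) * sum_i s1_i * s2_i."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)


rng = random.Random(1)  # fixed seed, used only for reproducibility
N = 6
g = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(N)]
G = gibbs_measure(g, beta=2.0)

# Average overlap of two independent draws from G.
mean_overlap = sum(G[s1] * G[s2] * overlap(s1, s2) for s1 in G for s2 in G)
```

Because the Hamiltonian is even in the configuration, the measure is symmetric under a global spin flip and the average overlap of two independent draws vanishes; this is one reason the squared overlap is the natural quantity in the results that follow.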

According to the folklore in the statistical physics community, the energy 
landscape of the S-K model has 'multiple valleys'. Although no precise for- 
mulation is available, one way to view this is that there are many nearly or- 
thogonal states with nearly minimal energy. For a physical discussion of the 
'many states' aspect of the S-K model, we refer to Chapter III. A very 
interesting rigorous formulation was attempted by Talagrand (see [34],
Conjecture 2.2.23), but no theorems were proved. Although our achievement
is quite modest, and may not be satisfactory to the physicists because we 
do not prove that the approximate minimum energy states correspond to 
significantly large regions of the state space — in fact, one may say that it 
is not what is meant by the physical term 'multiple valleys' at all because an 
isolated low energy state does not necessarily represent a valley — it does 
seem that Theorem 1.1 is the first rigorous result about the multimodal 
geometry of the Sherrington-Kirkpatrick energy landscape. We may call it 
'multiple valleys in a weak sense'. 

Theorem 1.1 can be generalized to the following Corollary, which shows 
that weak multiple valleys exist at 'every energy level' and not only for the 
lowest energy. 
Corollary 1.2. Let all notation be the same as in Theorem 1.1. Fix a
number $\alpha \in (0, 1]$. Then for all sufficiently large $N$, with probability at least
$1 - 2\gamma_N$ there exists a set $A \subseteq \{-1,1\}^N$ satisfying conditions (a) and (b)
of Theorem 1.1 such that $|X_N(\sigma) - \alpha M_N| \le \delta_N |M_N|$ for all $\sigma \in A$.

The variables $(g_{ij})_{1 \le i,j \le N}$ in the Hamiltonian $H_N$ are collectively called
the 'couplings' or the 'disorder'. Our proof of Theorem 1.1 is based on the
chaotic nature of the S-K model under small perturbations of the couplings;
this is discussed in the next subsection. The relation between chaos and
multiple valleys follows from a general principle outlined in [7], although the
proof in the present paper is self-contained.

1.2. Disorder chaos in the S-K model. Recall the Gibbs measure $G_N$
of the S-K model, defined in (5). Suppose $\sigma^1$ and $\sigma^2$ are two configurations
drawn independently according to the measure $G_N$, and the overlap $R_{1,2}$ is
defined as in (3). It is known that when $\beta < 1$, $R_{1,2} \approx 0$ with high probability
[13 El EH-. However, it is also known that $R_{1,2}$ cannot be concentrated near
zero for all $\beta$, because that would give a contradiction to the existence of a
phase transition as established in [1]. In fact, it is believed that the limiting
distribution of $R_{1,2}$ in the low temperature phase is given by the so-called
'Parisi measure', a notion first made rigorous by Talagrand [35, 36].

Now suppose we choose $\sigma^2$ not from the Gibbs measure $G_N$, but from a
new Gibbs measure $G'_N$, based on a new Hamiltonian $H'_N$ which is obtained
by applying a small perturbation to the Hamiltonian $H_N$. (We will make
precise the notion of a small perturbation below.) Is it still true that $R_{1,2}$ has
a non-degenerate limiting distribution at low temperatures? The conjecture
of disorder chaos (i.e. chaos with respect to small fluctuations in the disorder
$(g_{ij})_{1 \le i,j \le N}$) states that indeed that is not the case: $R_{1,2}$ is concentrated
near zero if $\sigma^1$ is picked from the Gibbs measure and $\sigma^2$ is picked from a
perturbed Gibbs measure. This is supposed to be true at all temperatures.
To the best of our knowledge, disorder chaos for the S-K model was first
discussed in the widely cited paper of Bray and Moore [5]; a related discussion
appears in the earlier paper [25]. The phenomenon of chaos itself was first
conjectured by Fisher and Huse [13] in the context of the Edwards-Anderson
model, although the term was coined in [5]. Again, to the best of our
knowledge, nothing has been proved rigorously yet. For further references in the
physics literature, let us refer to the recent paper [23].

Note that this idea of chaos should not be confused with temperature 
chaos (also discussed in [5]), which says that spin glasses are chaotic with 
respect to small changes in the inverse temperature (3. 

We shall consider two kinds of perturbation of the disorder. The first,
what we call 'discrete perturbation', is executed by replacing a randomly
chosen small fraction of the couplings $(g_{ij})$ by independent copies. Here
small fraction means a fraction $p$ that goes to zero as $N \to \infty$. Discrete
perturbation is the usual way to proceed in the noise-sensitivity literature
(see e.g. (3j SJ El G7J HE]). In fact, it seems that the following result is
intimately connected with noise-sensitivity, although we do not see any obvious
way to use the standard noise-sensitivity techniques to derive it.

Theorem 1.3. Consider the S-K model at inverse temperature $\beta$. Take any
$N$ and $p \in [0, 1]$. Suppose a randomly chosen fraction $p$ of the couplings $(g_{ij})$
are replaced by independent copies to give a perturbed Gibbs measure. Let
$\sigma^1$ be chosen from the original Gibbs measure and $\sigma^2$ be chosen from the
perturbed measure. Let the overlap $R_{1,2}$ be defined as in (3). Then



where C is an absolute constant and the expectation is taken over all ran- 
domness. 

This theorem shows that the system is chaotic if the fraction p goes to 
zero slower than 1 / log N. The derivation of this result is based on the 
'superconcentration' property of the free energy in the S-K model that we 
present in the next subsection. 

The notion of perturbation in the above theorem, though natural, is not
the only available notion. In fact, in the original physics papers (e.g. [5]),
a different manner of perturbation is proposed, which we call continuous
perturbation. Here we replace $g_{ij}$ by $a g_{ij} + b g'_{ij}$, where $(g'_{ij})$ is another
set of independent standard Gaussian random variables and $a^2 + b^2 = 1$ so
that the resultant couplings are again standard Gaussian. When $a \approx 1$, we
say that the perturbation is small. A convenient way to parametrize the
perturbation is to set $a = e^{-t}$, where $t$ is a parameter that we call 'time'.
This nomenclature is natural, because perturbing the couplings up to time
$t$ corresponds to running an Ornstein-Uhlenbeck flow at each coupling for
time $t$, with initial value $g_{ij}$. The following theorem says that the S-K model
is chaotic under small continuous perturbations.

Theorem 1.4. Consider the S-K model at inverse temperature $\beta$. Take any
$t > 0$. Suppose we continuously perturb the couplings up to time $t$, as defined
above. Let $\sigma^1$ be chosen from the original Gibbs measure and $\sigma^2$ be chosen
from the perturbed measure. Let the overlap $R_{1,2}$ be defined as in (3). Then
there is an absolute constant $C$ such that for any positive integer $k$,



The expectation is taken over all randomness. 

Again, the achievement is very modest, and does not come anywhere close 
to the claims of the physicists. But once again, this is the first rigorous result 
about chaos of any kind in the S-K model. To the best of our knowledge, 
the only other instance of a rigorous proof of chaos in any spin glass model 
is in the work of Panchenko and Talagrand [30], who established chaos with 
respect to small changes in the external field in the spherical S-K model. 
Disorder chaos in directed polymers was established by the author in [7]. 
A deficiency of both theorems in this subsection is that they do not cover
the case of zero temperature, that is, $\beta = \infty$, where the Gibbs measure
concentrates all its mass on the ground state. In principle, the same techniques
should apply, but there are some crucial hurdles that cannot be cleared with 
the available ideas. 
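The continuous perturbation described in this subsection is simple to express in code. The following is a minimal sketch under the parametrization a = e^{-t} from the text; the function names `ou_coefficients` and `continuous_perturbation`, the seed, and the sample sizes are illustrative assumptions rather than anything taken from the paper.

```python
import math
import random


def ou_coefficients(t):
    """Return (a, b) with a = exp(-t) and b = sqrt(1 - exp(-2t)), so a^2 + b^2 = 1."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - a * a)
    return a, b


def continuous_perturbation(g, g_prime, t):
    """Perturb the couplings g up to time t using the independent copy g_prime."""
    a, b = ou_coefficients(t)
    return [a * x + b * y for x, y in zip(g, g_prime)]


rng = random.Random(2)  # fixed seed, used only for reproducibility
n_couplings = 16
g = [rng.gauss(0.0, 1.0) for _ in range(n_couplings)]
g_prime = [rng.gauss(0.0, 1.0) for _ in range(n_couplings)]

g_small = continuous_perturbation(g, g_prime, t=0.1)  # a is close to 1
g_zero = continuous_perturbation(g, g_prime, t=0.0)   # no perturbation at all
```

With this parametrization each perturbed coupling is again standard Gaussian and has covariance e^{-t} with the original coupling, so small t corresponds to a small perturbation.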

1.3. Superconcentration in the S-K model. The notion of
superconcentration was defined in [7]. The definition in [7] pertains only to maxima
of Gaussian fields, but it can be generalized to roughly mean the following: a 
Lipschitz function of a collection of independent standard Gaussian random 
variables is superconcentrated whenever its order of fluctuations is much 
smaller than its Lipschitz constant. This definition is related to the classical 
concentration result for the Gaussian measure, which says that the order of 
fluctuations of a Lipschitz function under the Gaussian measure is bounded 
by its Lipschitz constant (see e.g. Theorem 2.2.4 in [34]), irrespective of the 
dimension. 

The free energy of the S-K model is defined as

(6)    $F_N(\beta) := -\frac{1}{\beta} \log \sum_{\sigma} e^{-\beta H_N(\sigma)},$

where $H_N$ is the Hamiltonian defined in (4). It follows from classical
concentration of measure that the variance of $F_N(\beta)$ is bounded by a constant
multiple of $N$ (see Corollary 2.2.5 in [34]). This is the best known bound for
$\beta \ge 1$. When $\beta < 1$, Talagrand (Theorems 2.2.7 and 2.2.13 in [34]) proved
that the variance can actually be bounded by an absolute constant. This
is also indicated in the earlier works of Aizenman, Lebowitz and Ruelle [1]
and Comets and Neveu [8]. Therefore, according to our definition, the free
energy is superconcentrated when $\beta < 1$. The following theorem shows that
$F_N$ is superconcentrated at any $\beta$.

Theorem 1.5. Let $F_N(\beta)$ be the free energy of the S-K model defined above
in (6). For any $\beta$, we have

    $\mathrm{Var}\, F_N(\beta) \le \frac{C N \log(2 + C\beta)}{\log N},$

where $C$ is an absolute constant.

This result may be reminiscent of the log N improvement in the variance 
of first passage percolation time [4]. However, the proof is quite different
in our case since hypercontractivity, the major tool in [4], does not seem
to work for spin glasses in any obvious way. In that sense, the two results
are quite unrelated. Our proof is based on our chaos theorem for continuous
perturbation (Theorem 1.4) and ideas from [7]. On the other hand,
Theorem 1.5 is used to derive the chaos theorem for discrete perturbation,
again drawing upon ideas from [7]. This equivalence between chaos and
superconcentration is one of the main themes of [7], which in a way shows the
significance of superconcentration, which may otherwise be viewed as just a
curious phenomenon. 

Incidentally, it was shown by Talagrand ([37], eq. (10.13)) that the lower 
tail fluctuations of $F_N(\beta)$ are actually as small as order 1 under an unproven 
hypothesis about the Parisi measure. 
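For small N the free energy can be evaluated exactly by enumeration, which makes its role as a smoothed ground state energy visible. The sketch below is an illustration only. It assumes the normalization F_N(beta) = -(1/beta) log sum_sigma exp(-beta H_N(sigma)), a reading chosen so that the beta = infinity limit is the ground state energy, together with H_N = -X_N / sqrt(N); the function names, the seed, and the values of N and beta are hypothetical.

```python
import itertools
import math
import random


def hamiltonian(g, sigma):
    """H_N(sigma) = -(1/sqrt(N)) * sum_{i,j} g_ij * sigma_i * sigma_j (assumed form)."""
    n = len(sigma)
    x = sum(g[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))
    return -x / math.sqrt(n)


def free_energy(g, beta):
    """Return (F_N(beta), ground state energy) by enumerating all configurations."""
    n = len(g)
    energies = [hamiltonian(g, s) for s in itertools.product((-1, 1), repeat=n)]
    e_min = min(energies)
    # log-sum-exp with the minimum energy factored out for numerical stability
    log_z = -beta * e_min + math.log(sum(math.exp(-beta * (e - e_min)) for e in energies))
    return -log_z / beta, e_min


rng = random.Random(4)  # fixed seed, used only for reproducibility
N = 6
g = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(N)]
values = {beta: free_energy(g, beta) for beta in (0.5, 2.0, 50.0)}
```

Under this normalization the free energy always lies between the ground state energy minus N log(2) / beta and the ground state energy itself, so it converges to the ground state energy as beta grows.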

1.4. Disorder chaos in the E-A model. Let G = (V, E) be an undirected 
graph. The Edwards-Anderson spin glass [11] on G is defined through the 
Hamiltonian

(7)    $H(\sigma) := -\sum_{(i,j) \in E} g_{ij} \sigma_i \sigma_j, \qquad \sigma \in \{-1,1\}^V,$

where $(g_{ij})$ is again a collection of i.i.d. random variables, often taken to be
Gaussian. The S-K model corresponds to the case of the complete graph,
up to normalization by $\sqrt{N}$.

For a survey of the (few) rigorous and non-rigorous results available for 
the Edwards-Anderson model, we refer to Newman and Stein [28]. 

Unlike the S-K model, there are two kinds of overlap in the E-A model. 
The 'site overlap' is the usual overlap defined in (3). The 'bond overlap'
between two states $\sigma^1$ and $\sigma^2$, on the other hand, is defined as

(8)    $Q_{1,2} := \frac{1}{|E|} \sum_{(i,j) \in E} \sigma^1_i \sigma^1_j \sigma^2_i \sigma^2_j.$

We show that the bond overlap in the E-A model is not chaotic with respect
to small fluctuations of the couplings at any temperature. This does not say
anything about the site overlap; the site overlap in the E-A model can well
be chaotic with respect to small fluctuations of the couplings, as predicted
in [13, 5].
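The difference between the two overlaps is easy to see on a toy example. The following sketch assumes the standard definition of the bond overlap, Q = (1/|E|) times the sum over edges (i, j) of s1_i s1_j s2_i s2_j, and uses a hypothetical 6-cycle as the graph; none of the names or data come from the paper.

```python
def site_overlap(s1, s2):
    """Site overlap: (1/|V|) * sum_i s1_i * s2_i."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)


def bond_overlap(edges, s1, s2):
    """Bond overlap: (1/|E|) * sum over edges (i, j) of s1_i*s1_j*s2_i*s2_j."""
    return sum(s1[i] * s1[j] * s2[i] * s2[j] for i, j in edges) / len(edges)


# Hypothetical example graph: a cycle on 6 vertices.
n = 6
edges = [(i, (i + 1) % n) for i in range(n)]

s1 = (1, -1, 1, 1, -1, 1)
s2 = tuple(-x for x in s1)  # global spin flip of s1
```

A global spin flip reverses the site overlap but leaves every bond product unchanged, so the two configurations above have site overlap -1 and bond overlap 1. The bond overlap therefore measures agreement of the edge variables rather than of the spins themselves.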

Theorem 1.6. Suppose the E-A Hamiltonian (7) on a graph $G$ is
continuously perturbed up to time $t \ge 0$, according to the definition of continuous
perturbation in Section 1.2. Let $\sigma^1$ be chosen from the original Gibbs
measure at inverse temperature $\beta$ and $\sigma^2$ be chosen from the perturbed measure.
Let the bond overlap $Q_{1,2}$ be defined as in (8). Let

q :=mm{p 2 ,^}, 

where d is the maximum degree of G. Then 

    $\mathbb{E}(Q_{1,2}) \ge C q e^{-t/(Cq)},$

where $C$ is a positive absolute constant. Moreover, the result holds for $\beta = \infty$
also, with the interpretation that the Gibbs measure at $\beta = \infty$ is just the
uniform distribution on the set of ground states.

An interesting case of the above theorem is when t = 0. The result then 
says that if two configurations are drawn independently from the Gibbs 
measure, they have a non-negligible bond overlap with non-vanishing prob- 
ability. The fact that this holds at any finite temperature is in contrast with 
the mean-field case (i.e. the S-K model), where there is a high-temperature
phase ($\beta < 1$) where the bond overlap becomes negligible.

However, while Theorem 1.6 establishes that the bond overlap does not 
become zero for any amount of perturbation, it does exhibit a sort of 
'quenched chaos', in the following sense. 

Theorem 1.7. Fix $t > 0$ and let $Q_{1,2}$ be as in Theorem 1.6. Then

That is, if we perturb the system by an amount $t \gg |E|^{-1}$, the bond
overlap between two configurations drawn from the two Gibbs measures is 
approximately equal to the quenched average of the overlap. In physical 
terms, the overlap 'self-averages'. 

The combination of the last two theorems brings to light a surprising 
phenomenon. On the one hand, the perturbation retains a memory of the 
original Gibbs measure, because the overlap is non-vanishing in Theorem 1.6.
On the other hand, the perturbation causes a chaotic reorganization of the 
Gibbs measure in such a way that the overlap concentrates on a single value 
in Theorem 1.7. The author can see no clear explanation of this confusing 
outcome. 

1.5. Absence of superconcentration in the E-A model. The proof of
Theorem 1.6 is based on the following result, which says that the free energy
is not superconcentrated in the E-A model on bounded degree graphs. This
generalizes a well-known result of Wehr and Aizenman [38], who proved the
analogous result on square lattices. The relative advantage of our approach is
that it does not use the structure of the graph, whereas the Wehr-Aizenman
proof depends heavily on properties of the lattice.

Theorem 1.8. Let $F(\beta)$ denote the free energy in the Edwards-Anderson
model on a graph $G$, defined in (7). Let $d$ be the maximum degree of $G$.
Then for any $\beta$, including $\beta = \infty$ (where the free energy is just the energy
of the ground state), we have

The above result is based on a formula (Theorem 3.11) for the variance
of an arbitrary smooth function of Gaussian random variables.

1.6. A note about other models. It will be clear from our proofs that the
chaos and superconcentration results hold for the p-spin versions of the S-K
model for even p. (See Chapter 6 of [34] for the definition of these models
and various results.) In fact, a generalization of Theorem 1.4 is proven in
Theorem 3.5 later, which includes the p-spin models for even p.

It will also be clear that the lack of superconcentration is true in the
Random Field Ising Model on general bounded degree graphs. (Again, the
lattice case is handled in [38]. We refer to [38] for the definition of the
RFIM.) The absence of superconcentration in the RFIM implies that the
site overlap is stable under perturbations, instead of the bond overlap as in
the E-A model.

A simple model where our techniques give sharp results is the Random
Energy Model (REM). This is discussed in Subsection 3.14.

1.7. Unsolved questions. In spite of the progress made in this paper
over [7], many key issues are still out of reach. Some of them are as
follows:

(1) Improve the multiple valley theorem (Theorem 1.1) so that $\delta_N$ is a
negative power of $N$, preferably better than $N^{-1/2}$, which will prove
'strong multiple valleys' in the sense of [7].

(2) Another possible improvement to Theorem 1.1 can be achieved by
increasing $r_N$ to something of the form $\exp(N^{\alpha})$.

(3) Prove the chaos theorems (Theorems 1.3 and 1.4) for the ground
state ($\beta = \infty$) of the S-K model.

(4) Improve the superconcentration result (Theorem 1.5) so that the
right hand side is $N^{\alpha}$ for some $\alpha < 1$. This is tied to the improvement
of the chaos result.

(5) If the above is not possible, at least prove a version of the
superconcentration result where the right hand side does not depend on $\beta$,
or has a better dependence than $\log \beta$. This will solve the question
of chaos for $\beta = \infty$.

(6) Prove that the site overlap in the Edwards-Anderson model is chaotic
with respect to fluctuations in the disorder, even though the bond
overlap is not.

(7) Prove disorder chaos in the S-K model with nonzero external field,
that is, if there is an additional term of the form $h \sum_i \sigma_i$ in the
Hamiltonian. The general nature of the S-K model indicates that
any result for $h \ne 0$ may be substantially harder to prove than
for $h = 0$. (Reportedly, a sketch of the proof in this case will appear
in the new edition of [34].)

(8) Show that in the E-A model, the variance of $\langle Q_{1,2} \rangle$ tends to zero
as the graph size goes to infinity.

(9) Establish temperature chaos in any of these models.

The rest of the paper is organized as follows. In Section 2 we sketch the
proofs of the main results. In Section 3 we present some general results
that cover a wider class of Gaussian fields. All proofs are given in Section 3.

2. Proof sketches 

In this section we give very short sketches of some of the main ideas of 
this paper. 
2.1. Multiple valleys from chaos. Suppose we choose $\sigma^1$ from the Gibbs
measure $G_N$ at inverse temperature $\beta$ and $\sigma^2$ from the measure $G'_N$ obtained
by applying a continuous perturbation up to time $t$. Let $H_N$ and $H'_N$ be the
two Hamiltonians. Suppose $\beta = \beta(N) \to \infty$ and $t = t(N) \to 0$ sufficiently
slowly so that chaos holds (i.e. $\mathbb{E}(R_{1,2}^2) \to 0$ as $N \to \infty$). Clearly this is
possible by Theorem 1.4. Then due to chaos, $\sigma^1$ and $\sigma^2$ are approximately
orthogonal. Since $\beta \to \infty$, $\sigma^1$ nearly minimizes $H_N$ and $\sigma^2$ nearly minimizes
$H'_N$. But, since $t \to 0$, $H_N \approx H'_N$. Thus, $\sigma^1$ and $\sigma^2$ both nearly minimize
$H_N$. This procedure finds two states that have nearly minimal energy and
are nearly orthogonal. Repeating this procedure, we find many such states.
The details of this argument are worked out in Subsection 3.3.

2.2. Superconcentration iff chaos under continuous perturbations. 

Let $\phi(t)$ denote $\mathbb{E}(R_{1,2}^2)$ when $\sigma^1$ is drawn from the unperturbed Gibbs
measure at inverse temperature $\beta$ and $\sigma^2$ is drawn from the Gibbs measure
continuously perturbed up to time $t$. Let $F_N(\beta)$ be the free energy defined
in (6). Then we show that

(9)    $\mathrm{Var}(F_N(\beta)) = N \int_0^\infty e^{-t} \phi(t)\, dt.$

The proof of this result (Theorem 3.8) is simply a combination of the heat
equation for the Ornstein-Uhlenbeck process and integration-by-parts. The
formula directly shows that $\mathrm{Var}(F_N(\beta)) = o(N)$ whenever $\phi(t)$ falls off
sharply to zero, which is a way of saying that chaos implies
superconcentration.

In Subsection 3.1 we show that $\phi$ is a nonnegative and decreasing
function. This proves the converse implication, since the integral of a
nonnegative decreasing function can be small only if the function drops off sharply
to zero.
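Both directions of this equivalence can be made quantitative with an elementary two-sided estimate. The display below is a sketch and not the paper's own argument: it assumes only that $\phi$ is nonnegative and decreasing and that the variance formula reads $\mathrm{Var}(F_N(\beta)) = N \int_0^\infty e^{-t}\phi(t)\,dt$; the cutoff $T > 0$ is an arbitrary parameter introduced for the illustration.

```latex
% Sketch: two-sided bound for a nonnegative decreasing \phi and any T > 0.
% Lower bound: keep only [0,T] and use \phi(t) \ge \phi(T) there.
% Upper bound: use \phi \le \phi(0) on [0,T] and \phi \le \phi(T) on [T,\infty).
\[
(1 - e^{-T})\,\phi(T)
  \;\le\; \int_0^\infty e^{-t}\phi(t)\,dt
  \;=\; \frac{\mathrm{Var}(F_N(\beta))}{N}
  \;\le\; (1 - e^{-T})\,\phi(0) + e^{-T}\,\phi(T).
\]
```

The left inequality says that a small variance forces $\phi(T)$ to be small at every fixed $T > 0$ (superconcentration implies chaos), while the right inequality with small $T$ shows that a sharp drop of $\phi$ forces the variance to be $o(N)$.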

2.3. Chaos under continuous perturbations. Suppose $\sigma^1$ is drawn from
the Gibbs measure of the S-K model at inverse temperature $\beta$, and $\sigma^2$ from
the measure continuously perturbed up to time $t$. Let $R_{1,2}$ be the overlap
of $\sigma^1$ and $\sigma^2$, as usual, and let

    $\phi_k(t) := \mathbb{E}(R_{1,2}^{2k}).$

We have to show that for all $t$,

    $\phi_k(t) \le C N^{-k \min\{1,\, t/C\}},$

where $C$ is some constant that depends only on $\beta$.

By repeated applications of differentiation and Gaussian
integration-by-parts, we show that $(-1)^j \phi_k^{(j)}(t) \ge 0$ for all $t$ and $j$. Here $\phi_k^{(j)}$ denotes the
$j$th derivative of $\phi_k$. Such functions are called completely monotone. Now,
by a classical theorem of Bernstein about completely monotone functions,
there is a probability measure $\mu_k$ on $[0, \infty)$ such that

(10)    $\phi_k(t) = \phi_k(0) \int_0^\infty e^{-xt}\, d\mu_k(x).$

By Hölder's inequality and the above representation, it follows that for
$0 \le t \le s$,

    $\phi_k(t) \le \phi_k(0)^{1 - t/s}\, \phi_k(s)^{t/s}.$

In other words, chaos under large perturbations implies chaos under small
perturbations. Thus, it suffices to prove that $\phi_k(s) \le \text{const.}\, N^{-k}$ for
sufficiently large $s$.
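The interpolation step can be spelled out in two lines. The following is a sketch of one standard way to carry it out, assuming $0 < t \le s$, $\phi_k(0) > 0$, and that $\mu_k$ is the probability measure in the Bernstein representation (10); it is not claimed to be the exact route taken in the paper.

```latex
% Sketch: Hoelder (equivalently Jensen) with exponent s/t \ge 1 applied to the
% probability measure \mu_k from the Bernstein representation.
\begin{align*}
\frac{\phi_k(t)}{\phi_k(0)}
  &= \int_0^\infty \bigl(e^{-xs}\bigr)^{t/s}\, d\mu_k(x) \\
  &\le \Bigl(\int_0^\infty e^{-xs}\, d\mu_k(x)\Bigr)^{t/s}
   = \Bigl(\frac{\phi_k(s)}{\phi_k(0)}\Bigr)^{t/s},
\end{align*}
% which rearranges to
\[
\phi_k(t) \le \phi_k(0)^{1 - t/s}\, \phi_k(s)^{t/s}.
\]
```

Since $\phi_k(0) \le 1$ (the overlap is bounded by 1 in absolute value), a bound of the form $\phi_k(s) \le \text{const.}\, N^{-k}$ at one large time $s$ gives $\phi_k(t) \le (\text{const.}\, N^{-k})^{t/s}$ for every $t \le s$, which is the source of the exponent $k \min\{1, t/C\}$ in the target bound.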

The next step is an 'induction from infinity'. It is not difficult to see that
when $t = \infty$, after integrating out the disorder, $\sigma^1$ and $\sigma^2$ are independent
and uniformly distributed on $\{-1, 1\}^N$. From this it follows that $\phi_k(\infty) =
\text{const.}\, N^{-k}$. We use this to obtain a similar bound on $\phi_k(s)$ for sufficiently
large $s$, through the following steps. First, we show that for any $k$ and $s$,

    $\phi_k'(s) \ge -2N\beta^2 e^{-s} \phi_{k+1}(s).$

Thus, we have a chain of differential inequalities. It is possible to manipulate
this chain to conclude that

    $\phi_k(s) \le 2^{-2N} \sum_{\sigma^1, \sigma^2} \left( \frac{\sigma^1 \cdot \sigma^2}{N} \right)^{2k} \exp\left( 2\beta^2 e^{-s} \frac{(\sigma^1 \cdot \sigma^2)^2}{N} \right).$

The right hand side is bounded by $\text{const.}\, N^{-k}$ if and only if $s$ is sufficiently
large. (This is related to the fact that when $Z$ is a standard Gaussian random
variable, $\mathbb{E}(e^{aZ^2}) < \infty$ if and only if $a < 1/2$.) This completes the proof
sketch. The details of the above argument are worked out in Subsection 3.1.
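One way to see how the chain of differential inequalities produces an exponential moment is to iterate it. The display below is a sketch under stated assumptions, not a transcription of the paper's proof: it assumes the inequality $\phi_k'(s) \ge -2N\beta^2 e^{-s}\phi_{k+1}(s)$ for every $k$, the bounds $0 \le \phi_k \le 1$, and it writes $\mathbb{E}_0$ (a notation introduced here) for expectation over independent uniform $\sigma^1, \sigma^2$.

```latex
% Step 1: integrate the differential inequality from s to infinity.
\[
\phi_k(s) = \phi_k(\infty) - \int_s^\infty \phi_k'(u)\,du
  \le \phi_k(\infty) + 2N\beta^2 \int_s^\infty e^{-u}\,\phi_{k+1}(u)\,du .
\]
% Step 2: apply the same bound to \phi_{k+1}, \phi_{k+2}, \dots, using
% \int_{s \le u_1 \le \dots \le u_j} e^{-(u_1 + \dots + u_j)}\,du = e^{-js}/j!
% and \phi_{k+m} \le 1 for the remainder after m steps.
\[
\phi_k(s) \le \sum_{j=0}^{m-1} \frac{(2N\beta^2 e^{-s})^j}{j!}\,\phi_{k+j}(\infty)
  + \frac{(2N\beta^2 e^{-s})^m}{m!} .
\]
% Step 3: let m \to \infty and use \phi_{k+j}(\infty) = \mathbb{E}_0(R_{1,2}^{2(k+j)}).
\[
\phi_k(s) \le \mathbb{E}_0\Bigl( R_{1,2}^{2k}\,
  \exp\bigl( 2N\beta^2 e^{-s} R_{1,2}^2 \bigr) \Bigr).
\]
```

Heuristically, $\sqrt{N} R_{1,2}$ is approximately standard Gaussian under $\mathbb{E}_0$, so the right hand side behaves like $N^{-k}\,\mathbb{E}(Z^{2k} e^{aZ^2})$ with $a = 2\beta^2 e^{-s}$; this is finite precisely when $a < 1/2$, which is where the requirement that $s$ be sufficiently large enters.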

2.4. Chaos in E-A model. The proof of Theorem 1.6, again, is based on
the representation (9) of the variance of the free energy and the
representation (10) of the function $\phi$ (both of which hold for the E-A model as well).
From (10), it follows that there is a nonnegative random variable $U$ such
that for all $t \ge 0$,

    $\phi(t) = \phi(0)\, \mathbb{E}(e^{-tU}).$

From this and (9) it follows that

    $\mathrm{Var}\, F(\beta) = N \phi(0)\, \mathbb{E}((1 + U)^{-1}).$

Next, we prove a simple analytical fact: Suppose $V$ is a nonnegative random
variable and let $v := \mathbb{E}((1 + V)^{-1})$. Then for any $t \ge 0$,

    $\mathbb{E}(e^{-tV}) \ge \frac{1}{2} v\, e^{-t(2 - v)/v}.$

Using this inequality for the random variable $U$ and the lower bound on the
variance from Theorem 1.8, it is easy to obtain the required lower bound
on the function $\phi(t)$, which establishes the absence of chaos. The details of
this argument are presented in Subsection 3.7.
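The analytical fact used here has a short proof by truncation. The following is a sketch of one possible argument (the proof given in Subsection 3.7 may proceed differently), under the reading that the inequality is $\mathbb{E}(e^{-tV}) \ge \frac{1}{2} v\, e^{-t(2-v)/v}$ with $v = \mathbb{E}((1+V)^{-1})$; the threshold $a$ is a symbol introduced only for this illustration.

```latex
% Sketch: choose the threshold a := (2 - v)/v, so that 1/(1 + a) = v/2.
% Note 0 < v \le 1, hence a \ge 1.
\begin{align*}
v = \mathbb{E}\Bigl(\frac{1}{1+V}\Bigr)
  &\le \mathbb{P}(V \le a) + \frac{1}{1+a}\,\mathbb{P}(V > a)
   \le \mathbb{P}(V \le a) + \frac{v}{2},
\end{align*}
% so \mathbb{P}(V \le a) \ge v/2, and therefore, for any t \ge 0,
\[
\mathbb{E}(e^{-tV}) \ge e^{-ta}\,\mathbb{P}(V \le a)
  \ge \frac{1}{2}\, v\, e^{-t(2-v)/v}.
\]
```

In words: if the average of $1/(1+V)$ is not small, then $V$ cannot be large with overwhelming probability, and so its Laplace transform cannot decay too fast.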
The proof of Theorem 1.7 involves a new idea. Let $g = (g_{ij})_{(i,j) \in E}$, and
let $g', g''$ be independent copies of $g$. For each $t$, let

    $g^t := e^{-t} g + \sqrt{1 - e^{-2t}}\, g', \qquad g^{-t} := e^{-t} g + \sqrt{1 - e^{-2t}}\, g''.$

For each $t \in \mathbb{R}$, let $\sigma^t$ denote a configuration drawn from the Gibbs measure
defined by the disorder $g^t$. For $t \ne s$, we assume that $\sigma^t$ and $\sigma^s$ are
independent given $g, g', g''$. Define



By a similar logic as in the derivation of (10), one can show that $\phi$ is a
completely monotone function. Also, $\phi$ is bounded by 1. Thus, for any

ail ' wi<«»<;. 

Now fix $t$ and let

u ijfci := E((a|a*-4(rf} | g), v ijM := E((a*aj)(^^) | g). 
It turns out that 

= LEI — E Hi^ijki ~ Vijki) 2 ) 

and 

E((Q CTV - t - (q^, ct -*)) 2 > = 7^2 E E (4« - 

' ' (i,j)&E,(k,l)&E 

where $Q_{\sigma^t, \sigma^{-t}}$ is the bond overlap between $\sigma^t$ and $\sigma^{-t}$. Combining these
two identities with the inequality (11), it is easy to complete the proof of
Theorem 1.7. The details are in Subsection 3.11.

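2.5. Chaos under discrete perturbations. Let $g = (g_{ij})_{1 \le i,j \le N}$, and let
$g'$ be an independent copy of $g$. For any $A \subseteq \{(i,j) : 1 \le i,j \le N\}$, let $g^A$
be the array whose $(i,j)$th component is

    $g^A_{ij} := g'_{ij}$ if $(i,j) \in A$,    $g^A_{ij} := g_{ij}$ if $(i,j) \notin A$.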



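Let $F_N$ be the free energy, considered as a function of $g$. Suppose $\epsilon_N$ and
$\delta_N$ are constants such that for all $(i,j)$,

    $\left| \frac{\partial F_N}{\partial g_{ij}} \right| \le N^{1/2} \delta_N$  and  $\left| \frac{\partial^2 F_N}{\partial g_{ij}^2} \right| \le N^{1/2} \epsilon_N$  almost surely.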



Fix $0 \le k \le N^2$, and let $A$ be a subset of $\{(i,j) : 1 \le i,j \le N\}$, chosen
uniformly at random from the collection of all subsets of size $k$. Let $\sigma^1$
be chosen from the Gibbs measure at inverse temperature $\beta$ defined by the
disorder $g$, and let $\sigma^2$ be drawn from the Gibbs measure defined by $g^A$. Let
$R_{1,2}$ denote the overlap of $\sigma^1$ and $\sigma^2$, as usual. The key step is to prove
that for some absolute constant $C$,

    $\mathbb{E}(R_{1,2}^2) \le \frac{CN}{k} \mathrm{Var}(F_N) + C N^2 \delta_N \epsilon_N.$

This inequality is the content of Theorem 3.14. The proof is completed by
showing that we can choose $\delta_N$ and $\epsilon_N$ such that $\delta_N \epsilon_N = o(N^{-2})$, and using
the superconcentration bound (Theorem 1.5) on the variance of $F_N$. The
details of the proof are given in Subsection 3.12.
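The discrete perturbation used in this subsection can be written down directly. The sketch below is illustrative only: it builds the perturbed array by replacing the entries indexed by a uniformly random subset of k positions with the corresponding entries of an independent copy. The function name `discrete_perturbation`, the seed, and the values N = 5 and k = 7 are hypothetical choices.

```python
import random


def discrete_perturbation(g, g_prime, k, rng):
    """Return (g_A, A): g with the entries indexed by a uniformly random
    k-subset A of positions replaced by the entries of the copy g_prime."""
    n = len(g)
    positions = [(i, j) for i in range(n) for j in range(n)]
    chosen = set(rng.sample(positions, k))  # uniform over subsets of size k
    g_a = [[g_prime[i][j] if (i, j) in chosen else g[i][j] for j in range(n)]
           for i in range(n)]
    return g_a, chosen


rng = random.Random(3)  # fixed seed, used only for reproducibility
N = 5
g = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(N)]
g_prime = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(N)]
k = 7
g_a, A = discrete_perturbation(g, g_prime, k, rng)
```

Replacing a fraction p of the couplings corresponds to taking k equal to p times N squared, rounded to an integer.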

2.6. No superconcentration in the E-A model. Although this result 
was already proven in [38] for the E-A model on lattices, it may be worth 
sketching our argument for general bounded degree graphs here. Our proof 
is based on a general lower bound for arbitrary functions of Gaussian random 
variables. The result (Theorem 3.12) goes as follows: Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is
an absolutely continuous function such that there is a version of its gradient
$\nabla f$ that is bounded on bounded sets. Let $g$ be a standard Gaussian random
vector in $\mathbb{R}^n$, and suppose $\mathbb{E}|f(g)|^2$ and $\mathbb{E}|\nabla f(g)|^2$ are both finite. Then

    $\mathrm{Var}(f(g)) \ge \frac{1}{2} \sum_{i=1}^n \left( \mathbb{E}\left( g_i \frac{\partial f}{\partial g_i} \right) \right)^2 \ge \frac{1}{2n} \left( \mathbb{E}(g \cdot \nabla f(g)) \right)^2,$

where $x \cdot y$ denotes the usual inner product on $\mathbb{R}^n$. We apply this result to
the Gaussian vector $g = (g_{ij})_{(i,j) \in E}$, taking the function $f(g)$ to be the free
energy $F(\beta)$. A few tricks are required to get a lower bound on the right
hand side that does not blow up as $\beta \to \infty$.
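A quick sanity check of the general lower bound is the quadratic function $f(g) = \sum_i g_i^2$, for which every quantity can be computed by hand. The worked example below is an illustration added here and does not appear in the paper; it assumes the bound reads $\mathrm{Var}(f(g)) \ge \frac{1}{2}\sum_i (\mathbb{E}(g_i\, \partial f/\partial g_i))^2 \ge \frac{1}{2n}(\mathbb{E}(g \cdot \nabla f(g)))^2$.

```latex
% Worked example: f(g) = g_1^2 + \dots + g_n^2, with g standard Gaussian in R^n.
% The hypotheses hold: \nabla f(g) = 2g is bounded on bounded sets and all
% relevant moments are finite.
\begin{align*}
\mathrm{Var}(f(g)) &= n\,\mathrm{Var}(g_1^2) = n\,(3 - 1) = 2n, \\
\mathbb{E}\Bigl(g_i \frac{\partial f}{\partial g_i}\Bigr)
  &= \mathbb{E}(2 g_i^2) = 2
  \quad\Longrightarrow\quad
  \frac{1}{2}\sum_{i=1}^n \Bigl(\mathbb{E}\Bigl(g_i \frac{\partial f}{\partial g_i}\Bigr)\Bigr)^2 = 2n, \\
\mathbb{E}\bigl(g \cdot \nabla f(g)\bigr) &= 2n
  \quad\Longrightarrow\quad
  \frac{1}{2n}\bigl(\mathbb{E}(g \cdot \nabla f(g))\bigr)^2 = 2n.
\end{align*}
```

All three quantities equal $2n$, so both inequalities hold with equality for this choice of $f$, which indicates that the bound cannot be improved in general without further assumptions.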

Incidentally, the above lower bound on the variance of Gaussian
functionals is based on a multidimensional Plancherel formula that may be of
independent interest:

    $\mathrm{Var}(f(g)) = \sum_{k=1}^\infty \frac{1}{k!} \sum_{1 \le i_1, \ldots, i_k \le n} \left( \mathbb{E}\left( \frac{\partial^k f}{\partial g_{i_1} \cdots \partial g_{i_k}} \right) \right)^2.$

Versions of this formula have been previously derived in the literature using
expansions with respect to the multivariate orthogonal Hermite polynomial
basis (see Subsection 3.8 for references). We give a different proof avoiding
the use of the orthogonal basis.

3. General results about Gaussian fields and proofs 

The results of Section 1 are applications of some general theorems about
Gaussian fields. These are presented in this section, together with the proofs
of the theorems of Section 1. Unlike the previous sections, we proceed
according to the theorem-proof format in the rest of the paper.

3.1. Chaos in Gaussian fields. Let $S$ be a finite set and let $X = (X_i)_{i\in S}$
be a centered Gaussian random vector. Let

$$\rho_{ij} := \mathrm{Cov}(X_i, X_j).$$

Let $X'$ be an independent copy of $X$, and for each $t \ge 0$, let

$$X^t := e^{-t}X + \sqrt{1-e^{-2t}}\,X'.$$

Fix $\beta > 0$. For each $t, s \ge 0$, define a probability measure $G_{t,s}$ on $S \times S$
that assigns mass

$$\frac{e^{\beta X_i^t+\beta X_j^s}}{\sum_{k,l}e^{\beta X_k^t+\beta X_l^s}}$$

to the point $(i,j)$, for each $(i,j) \in S \times S$. The average of a function
$h : S \times S \to \mathbb{R}$ under the measure $G_{t,s}$ will be denoted by $\langle h\rangle_{t,s}$, that is,

$$\langle h\rangle_{t,s} := \frac{\sum_{i,j}h(i,j)\,e^{\beta X_i^t+\beta X_j^s}}{\sum_{i,j}e^{\beta X_i^t+\beta X_j^s}}.$$

We will consider the covariance kernel $\rho$ as a function on $S \times S$, defined as
$\rho(i,j) := \rho_{ij}$. Alternatively, it will also be considered as a square matrix.
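As a quick illustration of the interpolation $X^t$, the following Python sketch computes the joint covariance of the stacked vector $(X, X^t)$ by linear algebra. The function name `ou_pair_covariance` is ours and purely illustrative; the only input is the covariance matrix $\rho$ and the time $t$ from the definitions above.

```python
import numpy as np

def ou_pair_covariance(rho: np.ndarray, t: float) -> np.ndarray:
    """Covariance of (X, X^t), where X^t = e^{-t} X + sqrt(1 - e^{-2t}) X'.

    X and X' are independent centered Gaussian vectors with covariance rho.
    """
    n = rho.shape[0]
    a = np.exp(-t)
    b = np.sqrt(1.0 - np.exp(-2.0 * t))
    eye, zero = np.eye(n), np.zeros((n, n))
    # (X, X^t) is a linear image of the independent pair (X, X').
    lin = np.block([[eye, zero], [a * eye, b * eye]])
    base = np.block([[rho, zero], [zero, rho]])
    return lin @ base @ lin.T
```

The diagonal blocks of the result both equal $\rho$ (so $X^t$ has the same law as $X$), while the off-diagonal block is $e^{-t}\rho$, which is the sense in which $X^t$ decorrelates from $X$ as $t$ grows.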

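Theorem 3.1. Assume that $\rho_{ij} \ge 0$ for all $i, j$. For each $i$, let

$$\nu_i := \mathbb{E}\biggl(\frac{e^{\beta X_i}}{\sum_je^{\beta X_j}}\biggr).$$

Let $\phi(x) = \sum_{k=0}^{\infty}c_kx^k$ be any convergent power series on $[0, \infty)$ all of whose
coefficients are nonnegative. Then for each $t \ge 0$,

$$\mathbb{E}\langle\phi\circ\rho\rangle_{0,t} \le \inf_{s\ge t}\bigl(\mathbb{E}\langle\phi\circ\rho\rangle_{0,0}\bigr)^{1-t/s}\Bigl(\sum_{i,j}\phi(\rho_{ij})\,e^{2\beta^2e^{-s}\rho_{ij}}\,\nu_i\nu_j\Bigr)^{t/s}.$$

Moreover, $\mathbb{E}\langle\phi\circ\rho\rangle_{0,t}$ is a decreasing function of $t$.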

Roughly, the way to apply this theorem is the following: prove that the 
right hand side is small for some large t using high temperature methods, 
and then use the infimum to show that the smallness persists for small t as 
well. 

Since the application of Theorem 3.1 to the S-K model seems to yield a
suboptimal result (Theorem 1.4), one can question whether Theorem 3.1 can
ever give sharp bounds. In Subsection 3.14 we settle this issue by showing
that Theorem 3.1 gives a sharp result for Derrida's Random Energy Model.

Let us now proceed to prove Theorem 3.1. In the following, $C_b^\infty(\mathbb{R}^S)$ will
denote the set of all infinitely differentiable real-valued functions on $\mathbb{R}^S$ with
bounded derivatives of all orders.

Let us first extend the definition of $X^t$ to negative $t$. This is done quite
simply. Let $X''$ be another independent copy of $X$ that is also independent
of $X'$, and for each $t \ge 0$, let

$$X^{-t} := e^{-t}X + \sqrt{1-e^{-2t}}\,X''.$$

Let us now recall Gaussian integration by parts: If $f : \mathbb{R}^S \to \mathbb{R}$ is an
absolutely continuous function such that $|\nabla f(X)|$ has finite expectation,
then for any $i \in S$,

$$\mathbb{E}(X_if(X)) = \sum_{j\in S}\rho_{ij}\,\mathbb{E}(\partial_jf(X)),$$

where $\partial_jf$ denotes the partial derivative of $f$ along the $j$th coordinate (see
e.g. [33], Appendix A.6). The following lemma is simply a reformulated
version of the above identity.

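Lemma 3.2. For any $f \in C_b^\infty(\mathbb{R}^S)$, we have

$$\frac{d}{dt}\mathbb{E}(f(X^{-t})f(X^t)) = -2e^{-2t}\sum_{i,j}\rho_{ij}\,\mathbb{E}(\partial_if(X^{-t})\,\partial_jf(X^t)).$$

Proof. For each $t \ge 0$, define

$$Y^t := \sqrt{1-e^{-2t}}\,X - e^{-t}X', \qquad Y^{-t} := \sqrt{1-e^{-2t}}\,X - e^{-t}X''.$$

A simple computation gives

$$\frac{d}{dt}\mathbb{E}(f(X^{-t})f(X^t)) = -\frac{e^{-t}}{\sqrt{1-e^{-2t}}}\,\mathbb{E}\bigl((Y^{-t}\cdot\nabla f(X^{-t}))f(X^t) + (Y^t\cdot\nabla f(X^t))f(X^{-t})\bigr)$$

$$= -\frac{2e^{-t}}{\sqrt{1-e^{-2t}}}\,\mathbb{E}\bigl((Y^{-t}\cdot\nabla f(X^{-t}))f(X^t)\bigr).$$

(Note that issues like moving derivatives inside expectations are easily taken
care of due to the assumption that $f \in C_b^\infty$.) One can verify by computing
covariances that $Y^{-t}$ and the pair $(X^{-t}, X')$ are independent. Moreover,

$$X^t = e^{-2t}X^{-t} + e^{-t}\sqrt{1-e^{-2t}}\,Y^{-t} + \sqrt{1-e^{-2t}}\,X'.$$

So for any $i$, Gaussian integration by parts gives

$$\mathbb{E}\bigl(Y_i^{-t}\,\partial_if(X^{-t})f(X^t)\bigr) = e^{-t}\sqrt{1-e^{-2t}}\sum_j\rho_{ij}\,\mathbb{E}\bigl(\partial_if(X^{-t})\,\partial_jf(X^t)\bigr).$$

The proof is completed by combining the last two steps. □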
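The identity of Lemma 3.2 can be checked by hand for a linear function $f(x) = a\cdot x$ (which has bounded derivatives of all orders): since $\mathrm{Cov}(X^{-t}, X^t) = e^{-2t}\rho$, one has $\mathbb{E}(f(X^{-t})f(X^t)) = e^{-2t}a^T\rho a$. The Python sketch below, with illustrative function names of our own choosing, compares a finite-difference derivative of this closed form with the right hand side of the lemma.

```python
import numpy as np

def corr_linear(a: np.ndarray, rho: np.ndarray, t: float) -> float:
    """E[f(X^{-t}) f(X^t)] for f(x) = a.x, which equals e^{-2t} a^T rho a."""
    return float(np.exp(-2.0 * t) * (a @ rho @ a))

def lemma_rhs_linear(a: np.ndarray, rho: np.ndarray, t: float) -> float:
    """-2 e^{-2t} sum_{i,j} rho_ij d_i f d_j f, using grad f = a."""
    return float(-2.0 * np.exp(-2.0 * t) * (a @ rho @ a))

def central_difference(fun, t: float, h: float = 1e-6) -> float:
    """Symmetric finite-difference approximation of fun'(t)."""
    return (fun(t + h) - fun(t - h)) / (2.0 * h)
```

This only exercises the simplest case; it is meant as a consistency check on signs and constants, not as a proof.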

Our next lemma is the most crucial component of the whole argument. It 
gives a way of extrapolating high temperature results to the low temperature 
regime. 

Lemma 3.3. Let $\mathcal{F}$ be the class of all functions $h$ on $[0, \infty)$ that can be
expressed as

$$h(t) = \sum_{i=1}^me^{-c_it}\,\mathbb{E}(f_i(X^{-t})f_i(X^t))$$

for some nonnegative integer $m$ and nonnegative real numbers $c_1, c_2, \ldots, c_m$,
and functions $f_1, \ldots, f_m$ in $C_b^\infty(\mathbb{R}^S)$. For any $h \in \mathcal{F}$, there is a probability
measure $\mu$ on $[0, \infty)$ such that for each $t \ge 0$,

$$h(t) = h(0)\int_{[0,\infty)}e^{-xt}\,d\mu(x).$$

In particular, for any $0 < t \le s$,

$$h(t) \le h(0)^{1-t/s}\,h(s)^{t/s}.$$

Proof. Note that any $h \in \mathcal{F}$ must necessarily be a nonnegative function,
since $X^{-t}$ and $X^t$ are independent and identically distributed conditional
on $X$, which gives

$$\mathbb{E}(f(X^{-t})f(X^t)) = \mathbb{E}\bigl((\mathbb{E}(f(X^t) \mid X))^2\bigr).$$

Now, if $h(0) = 0$, then $h(t) = 0$ for all $t$, and there is nothing to prove. So
let us assume $h(0) > 0$.

Since $\rho$ is a positive semidefinite matrix, there is a square matrix $C$ such
that $\rho = C^TC$. Thus, given a function $f$, if we define

$$g_i := \sum_jC_{ij}\,\partial_jf,$$

then by Lemma 3.2 we have

$$\frac{d}{dt}\mathbb{E}(f(X^{-t})f(X^t)) = -2e^{-2t}\sum_i\mathbb{E}(g_i(X^{-t})\,g_i(X^t)).$$

From this observation and the definition of $\mathcal{F}$, it follows easily that if $h \in \mathcal{F}$,
then $-h' \in \mathcal{F}$. Proceeding by induction, we see that for any $k$, $(-1)^kh^{(k)}$ is
a nonnegative function (where $h^{(k)}$ denotes the $k$th derivative of $h$). Such
functions on $[0, \infty)$ are called 'completely monotone'. The most important
property of completely monotone functions (see e.g. Feller [12], Vol. II,
Section XIII.4) is that any such function $h$ can be represented as the Laplace
transform of a positive Borel measure $\nu$ on $[0, \infty)$, that is,

$$h(t) = \int_{[0,\infty)}e^{-xt}\,d\nu(x).$$

Moreover, $h(0) = \nu(\mathbb{R})$. By taking $\mu(dx) = h(0)^{-1}\nu(dx)$, this proves the first
assertion of the lemma. For the second, note that by Hölder's inequality,
we have that for any $0 < t \le s$,

$$\int e^{-xt}\,d\mu(x) \le \Bigl(\int e^{-xs}\,d\mu(x)\Bigr)^{t/s} = (h(s)/h(0))^{t/s}.$$

This completes the proof. □
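The interpolation inequality $h(t) \le h(0)^{1-t/s}h(s)^{t/s}$ is a general property of Laplace transforms of positive measures. The following Python sketch illustrates it for a finite mixture of exponentials; the function names and the example mixture are hypothetical and are not taken from the paper.

```python
import numpy as np

def laplace_mixture(t: float, rates, weights) -> float:
    """h(t) = sum_k weights[k] * exp(-rates[k] * t), with weights >= 0."""
    rates = np.asarray(rates, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * np.exp(-rates * t)))

def interpolation_bound(t: float, s: float, rates, weights) -> float:
    """Right hand side h(0)^{1 - t/s} * h(s)^{t/s} for 0 < t <= s."""
    h0 = laplace_mixture(0.0, rates, weights)
    hs = laplace_mixture(s, rates, weights)
    return h0 ** (1.0 - t / s) * hs ** (t / s)
```

In the setting of Theorem 3.1, $s$ plays the role of a large time at which a high temperature bound is available, and the inequality transfers part of that smallness to every smaller $t$.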

The next lemma is obtained by a variant of the Gaussian interpolation
methods for analyzing mean field spin glasses at high temperatures. It is
similar to R. Latała's unpublished proof of the replica symmetric solution
of the S-K model (to appear in the new edition of [34]).

Lemma 3.4. Let $\phi$ and $\nu_i$ be as in Theorem 3.1. Then for each $t \ge 0$,

$$\mathbb{E}\langle\phi\circ\rho\rangle_{-t,t} \le \sum_{i,j}\phi(\rho_{ij})\,e^{2\beta^2e^{-2t}\rho_{ij}}\,\nu_i\nu_j.$$

Proof. For each $i$, define a function $p_i : \mathbb{R}^S \to \mathbb{R}$ as

$$p_i(x) := \frac{e^{\beta x_i}}{\sum_je^{\beta x_j}}.$$

Note that

$$\partial_jp_i = \beta(p_i\delta_{ij} - p_ip_j),$$

where $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise. Since $p_i$ is bounded, this proves in
particular that $p_i \in C_b^\infty(\mathbb{R}^S)$.
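The derivative formula for the Gibbs weights $p_i$ is the familiar softmax Jacobian. A minimal Python sketch (function names are ours) that evaluates the weights stably and returns the matrix $(\partial_jp_i)_{i,j}$ is:

```python
import numpy as np

def gibbs_weights(x: np.ndarray, beta: float) -> np.ndarray:
    """p_i(x) = exp(beta x_i) / sum_j exp(beta x_j)."""
    z = beta * (x - np.max(x))  # shifting by the max leaves the weights unchanged
    w = np.exp(z)
    return w / w.sum()

def gibbs_jacobian(x: np.ndarray, beta: float) -> np.ndarray:
    """J[i, j] = d_j p_i = beta * (p_i * delta_ij - p_i * p_j)."""
    p = gibbs_weights(x, beta)
    return beta * (np.diag(p) - np.outer(p, p))
```

Every entry of the Jacobian is bounded by $\beta$ in absolute value, which is the boundedness used in the proof.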

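Take any nonnegative integer $r$. Since $\rho = (\rho_{ij})$ is a positive semidefinite
matrix, so is $\rho^{(r)} := (\rho_{ij}^r)$. (To see this, just note that if $X^1, \ldots, X^r$ are
independent copies of $X$, then $\mathrm{Cov}(X_i^1\cdots X_i^r,\ X_j^1\cdots X_j^r) = \rho_{ij}^r$.) Therefore
there exists a matrix $C^{(r)} = (C_{ij}^{(r)})$ such that $\rho^{(r)} = (C^{(r)})^TC^{(r)}$. Define the
functions

$$h_i := \sum_jC_{ij}^{(r)}p_j.$$

In the following we will denote $p_i(X^s)$ and $h_i(X^s)$ by $p_i^s$ and $h_i^s$ respectively,
for all $s \in \mathbb{R}$. Let

$$f_r(t) := \mathbb{E}\Bigl(\sum_{i,j}\rho_{ij}^r\,p_i^{-t}p_j^t\Bigr) = \mathbb{E}\Bigl(\sum_ih_i^{-t}h_i^t\Bigr).$$

By Lemma 3.2 we get

$$f_r'(t) = -2e^{-2t}\sum_i\sum_{k,l}\rho_{kl}\,\mathbb{E}\bigl(\partial_kh_i(X^{-t})\,\partial_lh_i(X^t)\bigr)$$

$$= -2\beta^2e^{-2t}\,\mathbb{E}\Bigl(\sum_{j,k}\rho_{jk}^{r+1}p_j^{-t}p_k^t - \sum_{j,k,l}\rho_{jl}\rho_{jk}^r\,p_j^{-t}p_k^tp_l^t - \sum_{j,k,l}\rho_{kl}\rho_{jl}^r\,p_j^{-t}p_k^{-t}p_l^t + \sum_{j,k,l,m}\rho_{kl}\rho_{jm}^r\,p_j^{-t}p_k^{-t}p_m^tp_l^t\Bigr).$$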

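Our objective is to get a lower bound for $f_r'(t)$. For this, we can delete the
two middle terms in the above expression because they contribute a positive
amount. For the fourth term, note that by Hölder's inequality,

$$\sum_{j,k,l,m}\rho_{kl}\rho_{jm}^r\,p_j^{-t}p_k^{-t}p_m^tp_l^t \le \Bigl(\sum_{k,l}\rho_{kl}^{r+1}p_k^{-t}p_l^t\Bigr)^{\frac{1}{r+1}}\Bigl(\sum_{j,m}\rho_{jm}^{r+1}p_j^{-t}p_m^t\Bigr)^{\frac{r}{r+1}} = \sum_{j,m}\rho_{jm}^{r+1}p_j^{-t}p_m^t.$$

Thus, by Lemma 3.3 and the above inequalities, we have

(13) $0 \ge f_r'(t) \ge -4\beta^2e^{-2t}f_{r+1}(t).$

Now let $g_r(u) := f_r(-\log\sqrt{u})$ for $0 < u \le 1$. Then

$$g_r'(u) = -\frac{1}{2u}\,f_r'(-\log\sqrt{u}).$$

The inequality (13) simply becomes

(14) $0 \le g_r'(u) \le 2\beta^2g_{r+1}(u).$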
Fix $0 < u \le 1$, $r \ge 1$. For each $m \ge 1$, let

$$T_m := \int_0^u\int_0^{u_1}\cdots\int_0^{u_{m-1}}(2\beta^2)^{m-1}g_{r+m-1}'(u_m)\,du_m\,du_{m-1}\cdots du_1.$$

Using (14), we see that

$$0 \le T_m \le \int_0^u\int_0^{u_1}\cdots\int_0^{u_{m-1}}(2\beta^2)^mg_{r+m}(u_m)\,du_m\,du_{m-1}\cdots du_1$$

$$= \int_0^u\cdots\int_0^{u_{m-1}}(2\beta^2)^m\Bigl(g_{r+m}(0) + \int_0^{u_m}g_{r+m}'(u_{m+1})\,du_{m+1}\Bigr)du_m\cdots du_1$$

$$= \frac{(2\beta^2)^mg_{r+m}(0)\,u^m}{m!} + T_{m+1}.$$

Inductively, this implies that for any $m \ge 1$,

$$g_r(u) = g_r(0) + T_1 \le \sum_{l=0}^{m-1}\frac{g_{r+l}(0)(2\beta^2u)^l}{l!} + T_m.$$

Again, for any $m \ge 2$,

$$0 \le T_m \le \int_0^u\int_0^{u_1}\cdots\int_0^{u_{m-1}}(2\beta^2)^mg_{r+m}(u_m)\,du_m\,du_{m-1}\cdots du_1 \le \frac{M^{r+m}(2\beta^2u)^m}{m!},$$

where $M = \max_{i,j}\rho_{ij}$. Thus, $\lim_{m\to\infty}T_m = 0$. Finally, observe that $p_i^{\infty}$ and
$p_j^{-\infty}$ are independent. This implies that for any $m$,

$$g_m(0) = f_m(\infty) = \sum_{i,j}\rho_{ij}^m\,\nu_i\nu_j.$$

Combining, we conclude that

$$g_r(u) \le \sum_{l=0}^{\infty}\frac{g_{r+l}(0)(2\beta^2u)^l}{l!} = \sum_{i,j}\rho_{ij}^r\,e^{2\beta^2u\rho_{ij}}\,\nu_i\nu_j.$$

The result now follows easily by taking $u = e^{-2t}$ and summing over $r$, using
the fact that $\phi$ has nonnegative coefficients in its power series. □

Proof of Theorem 3.1. Let $p_i$ be as in the proof of Lemma 3.4. As noted in
the proof of Lemma 3.4, the matrix $(\rho_{ij}^r)$ is positive semidefinite for every
nonnegative integer $r$. Since $\phi$ has nonnegative coefficients in its power
series, it follows that the matrix $\Phi := (\phi(\rho_{ij}))$ is also positive semidefinite.
Let $C = (C_{ij})$ be a matrix such that $\Phi = C^TC$. Then

$$\langle\phi\circ\rho\rangle_{-t,t} = \sum_{i,j}\phi(\rho_{ij})\,p_i^{-t}p_j^t = \sum_i\Bigl(\sum_jC_{ij}p_j^{-t}\Bigr)\Bigl(\sum_jC_{ij}p_j^t\Bigr).$$

Therefore, the function

$$h(t) := \mathbb{E}\langle\phi\circ\rho\rangle_{-t,t}$$

belongs to the class $\mathcal{F}$ of Lemma 3.3. The proof is now finished by using
Lemma 3.3 and Lemma 3.4, and the observation that $h(t/2) = \mathbb{E}\langle\phi\circ\rho\rangle_{0,t}$
(since $(X^{-t/2}, X^{t/2})$ has the same law as $(X^0, X^t)$). The claim that $h(t)$ is
a decreasing function of $t$ is automatic because $h' \le 0$. □

3.2. Proof of Theorem 1.4. We are now ready to give a proof of
Theorem 1.4 using Theorem 3.1. In fact, we shall prove a slightly more general
result below, which also covers the case of p-spin models for even p, as well as
further generalizations.

Let $N$ be a positive integer and suppose $(H_N(\sigma))_{\sigma\in\{-1,1\}^N}$ is a centered
Gaussian random vector with

$$\mathrm{Cov}(H_N(\sigma), H_N(\sigma')) = N\xi(R_{\sigma,\sigma'}),$$

where $\xi$ is some function on $[-1, 1]$ that does not depend on $N$ and $R_{\sigma,\sigma'}$ is
the overlap defined in (3). Let us fix $\beta > 0$. The Hamiltonian $H_N$ defines
a Gibbs measure on $\{-1, 1\}^N$ by putting mass proportional to $e^{-\beta H_N(\sigma)}$ at
each configuration $\sigma$. This class of models was considered by Talagrand
[35] in his proof of the generalized Parisi formula. For the S-K model,
$\xi(x) = x^2/2$, while for the p-spin models, $\xi(x) = x^p/p!$. (We refer to Chapter
6 in [34] for the definition of the p-spin models and related discussions.)
Let $H_N'$ be an independent copy of $H_N$, and for each $t \ge 0$, let

$$H_N^t := e^{-t}H_N + \sqrt{1-e^{-2t}}\,H_N'.$$

Given a function $h$ on $\{-1, 1\}^N \times \{-1, 1\}^N$, we define the average $\langle h(\sigma, \sigma')\rangle_{t,s}$
as the average with respect to the product of the Gibbs measures defined
by $H_N^t$ and $H_N^s$, that is,

$$\langle h(\sigma,\sigma')\rangle_{t,s} := \frac{\sum_{\sigma,\sigma'}h(\sigma,\sigma')\,e^{-\beta H_N^t(\sigma)-\beta H_N^s(\sigma')}}{\sum_{\sigma,\sigma'}e^{-\beta H_N^t(\sigma)-\beta H_N^s(\sigma')}}.$$

The following result establishes the presence of chaos in this class of models 
under some restrictions on £. It is easy to see that the result covers all p-spin 
models for even p, and in particular, the original SK model. 

Theorem 3.5. Suppose $\xi$ is nonnegative, $\xi(1) = 1$ and there is a constant
$c$ such that $\xi(x) \le cx^2$ for all $x \in [-1, 1]$. Then there is a constant $C$
depending only on $c$ such that for all $t \ge 0$ and $\beta > 1$, and any positive
integer $k$,

$$\mathbb{E}\langle(\xi(R_{\sigma,\sigma'}))^k\rangle_{0,t} \le (Ck)^kN^{-k\min\{1,\ t/C\log(1+C\beta)\}}.$$
Proof. By symmetry, it is easy to see that for each $\sigma$,

$$\mathbb{E}\biggl(\frac{e^{-\beta H_N(\sigma)}}{\sum_{\sigma'}e^{-\beta H_N(\sigma')}}\biggr) = 2^{-N}.$$

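Again, it follows from elementary combinatorial arguments that there are
positive constants $\gamma$ and $C$ that do not depend on $N$, such that for any
positive integer $k$ and any $N$,

$$2^{-2N}\sum_{\sigma,\sigma'}R_{\sigma,\sigma'}^{2k}\exp\bigl(\gamma NR_{\sigma,\sigma'}^2\bigr) \le \Bigl(\frac{Ck}{N}\Bigr)^k.$$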

Choosing $s$ so that $2\beta^2ce^{-s} = \min\{\gamma, 2\beta^2c\}$, and $\phi(x) = x^k/N^k$, we see from
Theorem 3.1 (and the assumption that $\xi(x) \le cx^2$) that for any $0 \le t \le s$

$$\mathbb{E}\langle(\xi(R_{\sigma,\sigma'}))^k\rangle_{0,t} \le \bigl(\mathbb{E}\langle(\xi(R_{\sigma,\sigma'}))^k\rangle_{0,0}\bigr)^{1-t/s}\Bigl(2^{-2N}\sum_{\sigma,\sigma'}(\xi(R_{\sigma,\sigma'}))^k\exp\bigl(2\beta^2e^{-s}N\xi(R_{\sigma,\sigma'})\bigr)\Bigr)^{t/s}$$

$$\le (Ck)^kN^{-kt/s},$$

where $C$ is a constant that does not depend on $N$. This proves the result
for $t \le s$. For $t \ge s$ we use the last assertion of Theorem 3.1 to conclude
that $\mathbb{E}\langle(\xi(R_{\sigma,\sigma'}))^k\rangle_{0,t}$ is decreasing in $t$. Finally, observe that

$$s = \begin{cases}0 & \text{if } 2\beta^2c \le \gamma,\\ \log(2\beta^2c/\gamma) & \text{if } 2\beta^2c > \gamma\end{cases}\ \le\ C\log(1+C\beta)$$

for some constant $C$ that depends only on $c$ and $\gamma$. □

3.3. Multiple valleys in Gaussian fields. In this subsection we use
Theorem 3.1 to prove a multiple valley result for general Gaussian fields. Let all
notation be the same as in Subsection 3.1. The idea of the proof is borrowed
from the proof of Theorem 3.7 in [7], although there are added complications
resulting from the fact that we are trying to derive a result about $\beta = \infty$
from a result about finite $\beta$ (i.e. Theorem 3.1).

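Theorem 3.6. Let $r$ be a positive integer, and let $\epsilon, \delta \in (0, 1)$. Choose any
$\beta > 0$ and $t > 0$. Let $M := \max_iX_i$, $m := \mathbb{E}(M)$ and $\sigma^2 := \max_i\mathrm{Var}(X_i)$.
Define

$$\gamma := \frac{8r\log|S|}{\delta m\beta} + \frac{2rt}{\delta} + e^{-m^2/8\sigma^2} + \frac{r^2}{2\epsilon\sigma^2}\,\mathbb{E}\langle\rho\rangle_{-t,t},$$

where the Gibbs average in the last term is taken at inverse temperature $\beta$.
Then with probability at least $1 - \gamma$, there exists a set $A \subseteq S$ of size $r$
such that for any distinct $i, j \in A$, we have $\rho_{ij} \le \epsilon\sigma^2$, and for all $i \in A$,
$X_i \ge (1-\delta)M$.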

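Proof. Given $X$, let $I$ be a random variable chosen from the set $S$ such that

$$\mathbb{P}(I = i \mid X) = \frac{e^{\beta X_i}}{\sum_je^{\beta X_j}}.$$

Next, let $M := \max_iX_i$ and define a random function

$$F(\beta) := \log\sum_ie^{\beta X_i}.$$

Then we have

$$\beta M = \log e^{\beta M} \le \log\sum_ie^{\beta X_i} \le \log(|S|e^{\beta M}) = \log|S| + \beta M.$$

Thus,

$$|F(\beta) - \beta M| \le \log|S|.$$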
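The two-sided bound $0 \le F(\beta) - \beta M \le \log|S|$ is easy to observe numerically. The Python sketch below (the helper name `free_energy` is ours) evaluates $F(\beta)$ with the usual shift by the maximum to avoid overflow.

```python
import math

def free_energy(x, beta: float) -> float:
    """F(beta) = log sum_i exp(beta * x_i), computed stably."""
    m = max(x)
    # Factor out exp(beta * m); the remaining sum lies between 1 and len(x).
    return beta * m + math.log(sum(math.exp(beta * (xi - m)) for xi in x))
```

For large $\beta$ the gap $F(\beta) - \beta M$ stays bounded by $\log|S|$ while $\beta M$ grows linearly, which is why $F(\beta)/\beta$ approximates the maximum.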
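An easy verification shows that

$$F''(\beta) = \mathbb{E}(X_I^2 \mid X) - (\mathbb{E}(X_I \mid X))^2 \ge 0.$$

Therefore $F'$ is an increasing function of $\beta$ and hence

$$F'(\beta) \ge \frac{2}{\beta}\int_{\beta/2}^{\beta}F'(x)\,dx = \frac{F(\beta) - F(\beta/2)}{\beta/2} \ge M - \frac{4\log|S|}{\beta}.$$

Combining this with the observation that $F'(\beta) = \mathbb{E}(X_I \mid X)$, we have

$$\mathbb{E}(M - X_I) = \mathbb{E}(M - F'(\beta)) \le \frac{4\log|S|}{\beta}.$$

Now let $Z^{(1)}, \ldots, Z^{(r)}$ be i.i.d. copies of $X$. Let

$$X^{(k)} := e^{-t}X + \sqrt{1-e^{-2t}}\,Z^{(k)},$$

and

$$Y^{(k)} := \sqrt{1-e^{-2t}}\,X - e^{-t}Z^{(k)}.$$

Then $X^{(k)}$ and $Y^{(k)}$ are independent (jointly Gaussian and all covariances
vanish), and

$$X = e^{-t}X^{(k)} + \sqrt{1-e^{-2t}}\,Y^{(k)}.$$

Let $I^{(k)}$ be a random variable on $S$ whose conditional distribution given
$X^{(k)}$ is the same as that of $I$ given $X$. In particular, by the independence of
$X^{(k)}$ and $Y^{(k)}$, $I^{(k)}$ and $Y^{(k)}$ are also independent. From this observation
and the above representation of $X$, we have

$$\mathbb{E}\bigl(X^{(k)}_{I^{(k)}} - X_{I^{(k)}}\bigr) = (1-e^{-t})\,\mathbb{E}\bigl(X^{(k)}_{I^{(k)}}\bigr) - \sqrt{1-e^{-2t}}\,\mathbb{E}\bigl(Y^{(k)}_{I^{(k)}}\bigr)$$

$$= (1-e^{-t})\,\mathbb{E}(X_I) \le tm. \quad\text{(Recall: } m = \mathbb{E}(M)\text{.)}$$

The last equality holds because $X^{(k)}_{I^{(k)}}$ and $X_I$ have the same unconditional
distribution. For the same reason, we have

$$\mathbb{E}\bigl(M - X^{(k)}_{I^{(k)}}\bigr) = \mathbb{E}(M - X_I) \le \frac{4\log|S|}{\beta}.$$

Combining the last two inequalities, we see that

$$\mathbb{E}\bigl(M - X_{I^{(k)}}\bigr) \le \frac{4\log|S|}{\beta} + tm.$$

Thus,

$$\mathbb{E}\sum_{k=1}^r\bigl(M - X_{I^{(k)}}\bigr) \le \frac{4r\log|S|}{\beta} + rtm.$$

Now, we clearly have that for any $k \ne l$,

$$\mathbb{E}\bigl(\rho_{I^{(k)}I^{(l)}}\bigr) = \mathbb{E}\langle\rho\rangle_{-t,t}.$$

Thus,

$$\mathbb{E}\sum_{1\le k<l\le r}\rho_{I^{(k)}I^{(l)}} = \frac{r(r-1)}{2}\,\mathbb{E}\langle\rho\rangle_{-t,t}.$$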

l<fc<Kr 

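Finally, by Gaussian concentration (see Proposition 1.3 in [7]), we have
$\mathbb{P}(M - m \le -x) \le e^{-x^2/2\sigma^2}$. Combining all steps, we get

$$\mathbb{P}\bigl(\min_kX_{I^{(k)}} < (1-\delta)M\bigr) \le \mathbb{P}\bigl(\min_kX_{I^{(k)}} < M - \delta m/2\bigr) + \mathbb{P}(2M < m)$$

$$\le \mathbb{P}\Bigl(\sum_{k=1}^r\bigl(M - X_{I^{(k)}}\bigr) > \delta m/2\Bigr) + e^{-m^2/8\sigma^2}$$

$$\le \frac{8r\log|S|}{\delta m\beta} + \frac{2rt}{\delta} + e^{-m^2/8\sigma^2}$$

and

$$\mathbb{P}\bigl(\max_{1\le k<l\le r}\rho_{I^{(k)}I^{(l)}} > \epsilon\sigma^2\bigr) \le \mathbb{P}\Bigl(\sum_{1\le k<l\le r}\rho_{I^{(k)}I^{(l)}} > \epsilon\sigma^2\Bigr) \le \frac{r^2}{2\epsilon\sigma^2}\,\mathbb{E}\langle\rho\rangle_{-t,t}.$$

Putting together the last two bounds, we see that the set $A := \{I^{(1)}, \ldots, I^{(r)}\}$
satisfies the requirements of the theorem. □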

As a corollary of Theorem 3.6 we now prove that multiple valleys exist
at 'all levels'.

Corollary 3.7. Let all notation be the same as in Theorem 3.6. Fix any
$0 < \alpha < 1$. Let



7 :=7 



o~m 

Then with probability at least $1 - \gamma'$, there exists a set $A \subseteq S$ of size $r$
such that for any distinct $i, j \in A$, we have $\rho_{ij} \le \epsilon\sigma^2$, and for all $i \in A$,
$|X_i - \alpha M| \le 5\delta|M|$.

Proof. Let $X'$ be an independent copy of $X$, and let

$$Y := \alpha X + \sqrt{1-\alpha^2}\,X'.$$

Note that $Y$ has the same distribution as $X$. Let $A$ be a set as in
Theorem 3.6. Let $M_Y := \max_iY_i$. Then for any $i \in A$,

$$|Y_i - \alpha M_Y| \le \alpha|X_i - M| + \alpha|M_Y - M| + \sqrt{1-\alpha^2}\,\max_{j\in A}|X_j'|$$

$$\le \delta M + |M_Y - M| + \max_{j\in A}|X_j'|.$$

By Gaussian concentration (see e.g. Proposition 1.3 in [7]), we have

$$\mathbb{P}(|M_Y - M| > \delta m) \le 4e^{-\delta^2m^2/8\sigma^2}$$

and

$$\mathbb{P}(|M| < m/2) \le 2e^{-m^2/8\sigma^2}.$$

Moreover, by the independence of $X'$ and $X$ and a standard result for
Gaussian random variables (see e.g. Lemma 2.1 in [7]), we get

$$\mathbb{E}\bigl(\max_{j\in A}|X_j'|\bigr) \le \mathbb{E}\bigl(\sigma\sqrt{2\log|A|}\bigr) = \sigma\sqrt{2\log r}.$$

Therefore,

$$\mathbb{P}\bigl(\max_{j\in A}|X_j'| > \delta m\bigr) \le \frac{\sigma\sqrt{2\log r}}{\delta m}.$$

From the above steps and Theorem 3.6 we have that with probability at
least $1 - \gamma'$, there is a set $A \subseteq S$ of size $r$ such that for any distinct $i, j \in A$,
we have $\rho_{ij} \le \epsilon\sigma^2$, and for each $i \in A$, $|Y_i - \alpha M_Y| \le 4\delta m$. Since $Y$ and $X$
have the same distribution, this completes the proof. □

3.4. Proofs of Theorem 1.1 and Corollary 1.2. These are direct
applications of Theorem 3.6 and Corollary 3.7. Consider the Gaussian field
$(H_N(\sigma))_{\sigma\in\{-1,1\}^N}$ defined in (4), and choose

$$\beta = e^{\sqrt{\log N}},\quad r = [(\log N)^{1/8}],\quad \delta = (\log N)^{-1/8},\quad t = (\log N)^{-1/3},\quad \epsilon = e^{-(\log N)^{1/8}}.$$

Note that $\sigma^2 = N/2$ and $|S| = 2^N$. Note that the quantity $\rho_{\sigma\sigma'}$, according
to the notation of Theorem 3.1, is just $NR_{\sigma,\sigma'}^2$. Thus, with the above value
of $t$, Theorem 1.4 says that for some absolute constants $C, c$,

$$\mathbb{E}\langle\rho\rangle_{-t,t} \le CN^{-c(\log N)^{-5/6}} = Ce^{-c(\log N)^{1/6}}.$$

Again, by the Sudakov minoration technique (see e.g. Lemma 2.3 in [7]) it
is not difficult to prove that $m \ge cN$ for some positive absolute constant $c$.
Invoking Theorem 3.6 we now get

$$\gamma_N \le C(\log N)^{1/4}e^{-\sqrt{\log N}} + C(\log N)^{-1/12} + Ce^{-cN} + C(\log N)^{1/4}e^{(\log N)^{1/8}}e^{-c(\log N)^{1/6}}$$

$$\le C(\log N)^{-1/12},$$

where $C$ and $c$ denote arbitrary absolute constants. This completes the
proof of Theorem 1.1. To prove Corollary 1.2, note that the quantity $\gamma'$ in
Corollary 3.7 can be bounded by

$$\gamma_N + Ce^{-cN(\log N)^{-1/4}} + CN^{-1/2}(\log N)^{1/8}\sqrt{\log\log N}.$$

Since $m \ge cN$ as noted before, this completes the proof of Corollary 1.2.

3.5. Superconcentration in Gaussian fields. Carrying on with the
notation of Subsection 3.1, we have the following formula for the variance of
the free energy associated with a Gaussian field at inverse temperature $\beta$.
This is a direct analog of Lemma 3.1 in [7].

Theorem 3.8. Take any $\beta > 0$. Let

$$F(X) := \frac{1}{\beta}\log\sum_{i\in S}e^{\beta X_i}.$$

Then

$$\mathrm{Var}\,F(X) = \int_0^{\infty}e^{-t}\,\mathbb{E}\langle\rho\rangle_{0,t}\,dt.$$

Proof. Note that by Lemma 3.2, for any smooth $f$ we have

$$\mathrm{Var}(f(X)) = -\int_0^{\infty}\frac{d}{dt}\mathbb{E}(f(X^{-t})f(X^t))\,dt$$

$$= 2\int_0^{\infty}e^{-2t}\sum_{i,j}\rho_{ij}\,\mathbb{E}(\partial_if(X^{-t})\,\partial_jf(X^t))\,dt.$$

Taking $f = F$, we get

$$\sum_{i,j}\rho_{ij}\,\partial_iF(X^{-t})\,\partial_jF(X^t) = \langle\rho\rangle_{-t,t}.$$

Again, since $(X^{-t}, X^t)$ has the same joint law as $(X, X^{2t})$, we see that
$\mathbb{E}\langle\rho\rangle_{-t,t} = \mathbb{E}\langle\rho\rangle_{0,2t}$. Combining the steps, we get

$$\mathrm{Var}(F(X)) = \int_0^{\infty}e^{-t}\,\mathbb{E}\langle\rho\rangle_{0,t}\,dt.$$

This completes the proof. □

3.6. Proof of Theorem 1.5. This is just a combination of Theorem 3.8
above and Theorem 1.4.

3.7. Chaos implies superconcentration. The goal of this subsection is
to prove that in the absence of superconcentration, we do not have chaos
either. This is an improved version of Theorem 3.2 in [7], where the absence
of chaos was proved only up to a finite time, but not for all $t$.

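Lemma 3.9. Suppose $U$ is a nonnegative random variable and let $v :=
\mathbb{E}((1+U)^{-1})$. Then for any $t > 0$, $\mathbb{E}(e^{-tU}) \ge \frac{1}{2}ve^{-t(2-v)/v}$.

Proof. Note that

$$\mathbb{E}(e^{-tU}) = \int_0^1\mathbb{P}(e^{-tU} \ge y)\,dy = \int_0^1\mathbb{P}\bigl((1+U)^{-1} \ge (1 - t^{-1}\log y)^{-1}\bigr)\,dy.$$

Now, for any $\epsilon > 0$, we have

$$\mathbb{E}((1+U)^{-1}) \le \epsilon + \mathbb{P}((1+U)^{-1} > \epsilon).$$

Thus, if $\epsilon \le v/2$, then

$$\mathbb{P}((1+U)^{-1} > \epsilon) \ge \frac{v}{2}.$$

Now $(1 - t^{-1}\log y)^{-1} \le v/2$ if and only if $y \le e^{-t(2-v)/v}$. Combining the
steps, we see that

$$\mathbb{E}(e^{-tU}) \ge \int_0^{e^{-t(2-v)/v}}\frac{v}{2}\,dy = \frac{v}{2}\,e^{-t(2-v)/v}.$$

This completes the proof of the lemma. □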
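For a concrete feel for Lemma 3.9, the following Python sketch evaluates both sides for a discrete nonnegative random variable $U$. The function names and the example distribution used to exercise them are hypothetical and only serve as an illustration.

```python
import math

def laplace_transform(t: float, values, probs) -> float:
    """E[exp(-t U)] for a discrete U taking values[i] with probability probs[i]."""
    return sum(p * math.exp(-t * u) for u, p in zip(values, probs))

def lemma_lower_bound(t: float, values, probs) -> float:
    """(v / 2) * exp(-t (2 - v) / v), where v = E[1 / (1 + U)]."""
    v = sum(p / (1.0 + u) for u, p in zip(values, probs))
    return 0.5 * v * math.exp(-t * (2.0 - v) / v)
```

The bound is far from tight in general; its value lies in giving an explicit exponential rate in terms of the single number $v$.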

Theorem 3.10. Let all notation be as in Subsection 3.1. Let $F(X)$ be as
in Theorem 3.8. Take any $\beta \in (0, \infty)$, and define

$$v := \frac{\mathrm{Var}\,F(X)}{\mathbb{E}\langle\rho\rangle_{0,0}}.$$

Then for all $t \ge 0$,

$$\mathbb{E}\langle\rho\rangle_{0,t} \ge \frac{1}{2}\,\mathrm{Var}(F(X))\,e^{-t(2-v)/v}.$$

Proof. By Theorem 3.8,

$$\mathrm{Var}\,F(X) = \int_0^{\infty}e^{-t}\,\mathbb{E}\langle\rho\rangle_{0,t}\,dt.$$

By Lemma 3.3, we see that there is a non-negative random variable $U$ such
that for all $t$,

$$\mathbb{E}\langle\rho\rangle_{0,t} = \mathbb{E}\langle\rho\rangle_{0,0}\,\mathbb{E}(e^{-tU}).$$

Combined with the formula for the variance derived above, this gives

$$\mathrm{Var}\,F(X) = \mathbb{E}\langle\rho\rangle_{0,0}\,\mathbb{E}((1+U)^{-1}).$$

The result now follows from Lemma 3.9. □



3.8. A formula for the variance of Gaussian functionals. In this
subsection we present a general formula for the variance of a function of
independent standard Gaussian random variables. After that, we derive a useful
lower bound for the variance using this formula.

The variance formula looks similar to those in Houdré and Kagan [21]
and Houdré [20] but it is not the same. Various versions of the formula have
appeared in Houdré and Pérez-Abreu ([22], Remark 2.3) and Houdré,
Pérez-Abreu and Surgailis ([23], Proposition 10). Essentially, this is the Parseval
identity for the $L^2$ norm of a Gaussian functional expressed as a sum of
squares of its Fourier coefficients in the orthogonal basis of multidimensional
Hermite polynomials. We present a direct proof that does not involve the
multivariate Hermite polynomial basis. Yet another proof, based on heat
kernel expansions, was suggested to the author in a private communication
by Michel Ledoux.

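Theorem 3.11. Let $g = (g_1, \ldots, g_n)$ be a vector of i.i.d. standard Gaussian
random variables, and let $f$ be a $C^\infty$ function of $g$ with bounded derivatives
of all orders. Then

$$\mathrm{Var}(f(g)) = \sum_{k=1}^{\infty}\frac{1}{k!}\sum_{1\le i_1,\ldots,i_k\le n}\Bigl(\mathbb{E}\Bigl(\frac{\partial^kf}{\partial g_{i_1}\cdots\partial g_{i_k}}\Bigr)\Bigr)^2.$$

The convergence of the infinite series is part of the conclusion.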
Proof. Let $g'$ and $g''$ be i.i.d. copies of $g$, and for each $t \ge 0$, define

$$g^t := e^{-t}g + \sqrt{1-e^{-2t}}\,g', \qquad g^{-t} := e^{-t}g + \sqrt{1-e^{-2t}}\,g''.$$

Let

$$\phi(t) := \mathbb{E}(f(g^{-t})f(g^t)).$$

Then by Lemma 3.2 we have

$$\phi'(t) = -2e^{-2t}\sum_{i=1}^n\mathbb{E}(\partial_if(g^{-t})\,\partial_if(g^t)).$$

For $0 < u \le 1$, define $\psi(u) = \phi(t(u))$, where $t(u) := -\frac{1}{2}\log u$. Then

$$\psi'(u) = \frac{d}{du}\mathbb{E}(f(g^{-t(u)})f(g^{t(u)})) = -\frac{1}{2u}\,\phi'\Bigl(-\frac{\log u}{2}\Bigr) = \sum_{i=1}^n\mathbb{E}(\partial_if(g^{-t(u)})\,\partial_if(g^{t(u)})).$$

Repeating this step $k$ times shows that

$$\psi^{(k)}(u) = \sum_{1\le i_1,\ldots,i_k\le n}\mathbb{E}\bigl(\partial_{i_1}\cdots\partial_{i_k}f(g^{-t(u)})\,\partial_{i_1}\cdots\partial_{i_k}f(g^{t(u)})\bigr).$$

As in the proof of Lemma 3.3, we observe that the expectations on the right
hand side are always nonnegative. We can continuously extend $\psi$ to the
closed interval $[0, 1]$ by defining $\psi(0) := \mathbb{E}(f(g')f(g'')) = (\mathbb{E}f(g))^2$. Then $\psi$
is a continuous function on $[0, 1]$ that is $C^\infty$ in $(0, 1)$ with all derivatives
nonnegative. Such functions are known as absolutely monotone (see Feller [12],
p. 223), and their most important property is that they can be represented
as a power series $\psi(u) = \sum_{k=0}^{\infty}p_ku^k$, where the coefficients are non-negative
and sum to $\psi(1)$. From this one can easily deduce that for any $k \ge 1$,

$$p_k = \frac{1}{k!}\sum_{1\le i_1,\ldots,i_k\le n}\bigl(\mathbb{E}(\partial_{i_1}\cdots\partial_{i_k}f(g))\bigr)^2.$$

Since $p_0 = \psi(0) = (\mathbb{E}(f))^2$ and $\psi(1) = \mathbb{E}(f^2)$, this completes the proof. □

A great advantage of Theorem 3.11 is that we can extract lower bounds for
the variance just by collecting a subset of the terms in the infinite sum. This
is exactly what we do to get the following theorem. We do not actually need
the theorem in its full generality (with respect to the smoothness conditions
on $f$), but prove it in the general form nonetheless.

Theorem 3.12. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is an absolutely continuous
function such that there is a version of its gradient $\nabla f$ that is bounded on
bounded sets. Let $g$ be a standard Gaussian random vector in $\mathbb{R}^n$, and
suppose $\mathbb{E}|f(g)|^2$ and $\mathbb{E}|\nabla f(g)|^2$ are both finite. Then

$$\mathrm{Var}(f(g)) \ge \frac{1}{2}\sum_{i=1}^n\bigl(\mathbb{E}(g_i\,\partial_if(g))\bigr)^2 \ge \frac{1}{2n}\bigl(\mathbb{E}(g\cdot\nabla f(g))\bigr)^2,$$

where $x\cdot y$ denotes the usual inner product on $\mathbb{R}^n$.
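In dimension $n = 1$ the theorem reads $\mathrm{Var}(f(g)) \ge \frac{1}{2}(\mathbb{E}(gf'(g)))^2$, which can be checked numerically. The Python sketch below uses Gauss-Hermite quadrature (NumPy's `hermegauss`, for the weight $e^{-x^2/2}$) to compute both sides; the helper names are ours, and the quadrature is exact only for polynomials of moderate degree.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

NODES, WEIGHTS = hermegauss(40)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # normalise to the N(0, 1) density

def gauss_mean(values: np.ndarray) -> float:
    """Quadrature approximation of E[h(g)], given values = h(NODES)."""
    return float(np.sum(WEIGHTS * values))

def variance_and_bound(f, df):
    """Return (Var f(g), (1/2) (E[g f'(g)])^2) for a standard Gaussian g."""
    fx = f(NODES)
    var = gauss_mean(fx ** 2) - gauss_mean(fx) ** 2
    bound = 0.5 * gauss_mean(NODES * df(NODES)) ** 2
    return var, bound
```

For $f(x) = x^2$ both sides equal 2, so the constant $\frac{1}{2}$ cannot be improved; for $f(x) = x^2 + x$ the variance is 3 while the bound is still 2, because the bound ignores the first-order term of the expansion in Theorem 3.11.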
Proof. First assume that $f \in C_b^\infty$. Theorem 3.11 gives

$$\mathrm{Var}(f(g)) \ge \frac{1}{2}\sum_{i=1}^n\bigl(\mathbb{E}(\partial_i^2f(g))\bigr)^2.$$

Integration by parts gives

$$\mathbb{E}(\partial_i^2f(g)) = \mathbb{E}((g_i^2-1)f(g)).$$

Thus, for any $C_b^\infty$ function $f$,

(15) $\mathrm{Var}(f(g)) \ge \frac{1}{2}\sum_{i=1}^n\bigl(\mathbb{E}((g_i^2-1)f(g))\bigr)^2.$

Let us now show that the above inequality holds for any bounded Lipschitz
function $f$. For each $t > 0$ and $x \in \mathbb{R}^n$, define

$$f_t(x) := \mathbb{E}(f(x + tg)).$$

Then we can write

$$f_t(x) = \int f(x+ty)\,\frac{e^{-|y|^2/2}}{(2\pi)^{n/2}}\,dy = \int f(y)\,\frac{e^{-|y-x|^2/2t^2}}{(2\pi t^2)^{n/2}}\,dy.$$

Since $f$ is a bounded function, it is clear from the above representation that
$f_t \in C_b^\infty$ for any $t > 0$, and hence (15) holds for $f_t$. Again, since $f$ is
Lipschitz,

$$|f_t(x) - f(x)| \le Lt\,\mathbb{E}|g|,$$

where $L$ is the Lipschitz constant of $f$. This shows that we can take $t \to 0$
and obtain (15) for $f$.

Next, we want to show (15) whenever $f$ is absolutely continuous and
square-integrable under the Gaussian measure, and the gradient of $f$ is
bounded on bounded sets. Take any such $f$. Let $h : \mathbb{R}^n \to [0, 1]$ be a
Lipschitz function that equals 1 in the ball of radius 1 centered at the
origin, and vanishes outside the ball of radius 2. For each $n \ge 1$, define

$$f_n(x) := f(x)\,h(n^{-1}x).$$

Then note that each $f_n$ is bounded and Lipschitz (with possibly increasing
Lipschitz constants). Thus, (15) holds for each $f_n$. Since $|f_n| \le |f|$
everywhere, and $f_n \to f$ pointwise, and $f$ is square-integrable under the Gaussian
measure, it follows that we can take $n \to \infty$ and get (15) for $f$.

Finally, we wish to show that if $\nabla f$ is square-integrable under the
Gaussian measure, we have

$$\mathbb{E}((g_i^2-1)f(g)) = \mathbb{E}(g_i\,\partial_if(g)).$$

(Note that $f$ is almost surely an absolutely continuous function of $g_i$ if we
fix $(g_j)_{j\ne i}$. This follows from Fubini's theorem.) The above identity follows
from the univariate identity

$$\mathbb{E}(Z\phi(Z)) = \mathbb{E}(\phi'(Z))$$

that holds when $Z$ is a standard Gaussian random variable and $\phi$ is any
absolutely continuous function such that $\mathbb{E}|\phi(Z)|$, $\mathbb{E}|Z\phi(Z)|$ and $\mathbb{E}|\phi'(Z)|$
are all finite. The identity is just integration-by-parts when $\phi$ is absolutely
continuous and vanishes outside a bounded set. In the general case, let
$\phi_n(x) = \phi(x)h(x/n)$, where $h : \mathbb{R} \to [0, 1]$ is a Lipschitz function that is 1 on
$[-1, 1]$ and vanishes outside $[-2, 2]$. Then the above identity holds for each
$\phi_n$, and we can pass to the limit using the dominated convergence theorem.
(Actually, it can be shown that the finiteness of $\mathbb{E}|\phi'(Z)|$ suffices.) As a last
step, we observe that by the Cauchy-Schwarz inequality,

$$\sum_{i=1}^n\bigl(\mathbb{E}(g_i\,\partial_if(g))\bigr)^2 \ge \frac{1}{n}\Bigl(\sum_{i=1}^n\mathbb{E}(g_i\,\partial_if(g))\Bigr)^2.$$

This completes the proof. □

3.9. Proof of Theorem 1.8. Let $g$ be as in the previous subsection. Let
$A$ be a finite subset of $\mathbb{R}^n$. Consider the function

$$f_\beta(x) := \frac{1}{\beta}\log\sum_{y\in A}e^{\beta y\cdot x}.$$

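Lemma 3.13. For any $\beta \ge 0$ we have

$$\mathrm{Var}(f_\beta(g)) \ge \sup_{0\le\beta'\le\beta}\frac{\beta'^2}{2n}\biggl(\mathbb{E}\biggl(\frac{\sum_{y\in A}|y|^2e^{\beta'y\cdot g}}{\sum_{y\in A}e^{\beta'y\cdot g}} - \biggl|\frac{\sum_{y\in A}y\,e^{\beta'y\cdot g}}{\sum_{y\in A}e^{\beta'y\cdot g}}\biggr|^2\biggr)\biggr)^2.$$

Proof. Note that

$$\nabla f_\beta(x) = \frac{\sum_{y\in A}y\,e^{\beta y\cdot x}}{\sum_{y\in A}e^{\beta y\cdot x}},$$

and therefore

$$x\cdot\nabla f_\beta(x) = \frac{\sum_{y\in A}(y\cdot x)\,e^{\beta y\cdot x}}{\sum_{y\in A}e^{\beta y\cdot x}} = \frac{\partial}{\partial\beta}\log\sum_{y\in A}e^{\beta y\cdot x}.$$

Now, it is easy to verify that $\log\sum_{y\in A}e^{\beta y\cdot x}$ is a convex function of $\beta$, and hence
for each $x$, $x\cdot\nabla f_\beta(x)$ is an increasing function of $\beta$. Thus, $\mathbb{E}(g\cdot\nabla f_\beta(g))$ is
also an increasing function of $\beta$. Moreover,

$$\mathbb{E}(g\cdot\nabla f_0(g)) = \frac{1}{|A|}\sum_{y\in A}\mathbb{E}(y\cdot g) = 0,$$

and therefore $\mathbb{E}(g\cdot\nabla f_\beta(g)) \ge 0$ for all $\beta \ge 0$. Finally note that by integration
by parts,

$$\mathbb{E}(g\cdot\nabla f_\beta(g)) = \sum_{i=1}^n\mathbb{E}(\partial_i^2f_\beta(g)) = \beta\,\mathbb{E}\biggl(\frac{\sum_{y\in A}|y|^2e^{\beta y\cdot g}}{\sum_{y\in A}e^{\beta y\cdot g}} - \biggl|\frac{\sum_{y\in A}y\,e^{\beta y\cdot g}}{\sum_{y\in A}e^{\beta y\cdot g}}\biggr|^2\biggr).$$

Combined with Theorem 3.12, this completes the proof. □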

We are now ready to complete the proof of Theorem 1.8. Consider an
undirected graph $G = (V, E)$, and the Edwards-Anderson spin glass model
on $G$ as defined in Subsection 1.4. Let $\langle\cdot\rangle_\beta$ denote the average with respect
to the Gibbs measure at inverse temperature $\beta$. First, we will work with
$\beta < \infty$. Let $F$ be as in Theorem 1.8. By Lemma 3.13 with $n = |E|$,
$g = (g_{ij})_{(i,j)\in E}$ and $A = \{(\sigma_i\sigma_j)_{(i,j)\in E} : \sigma \in \{-1, 1\}^V\}$, we get

$$\mathrm{Var}\,F(\beta) \ge \sup_{0\le\beta'\le\beta}\frac{\beta'^2}{2|E|}\Bigl(|E| - \sum_{(i,j)\in E}\mathbb{E}\langle\sigma_i\sigma_j\rangle_{\beta'}^2\Bigr)^2.$$

Now, under the Gibbs measure at inverse temperature $\beta'$, the conditional
expectation of $\sigma_i$ given the rest of the spins is $\tanh(\beta'\sum_{j\in N(i)}g_{ij}\sigma_j)$, where
$N(i)$ is the neighborhood of $i$ in the graph $G$. Using this fact and the
inequality $|\tanh x| \le |x|$, we get

$$\mathbb{E}\langle\sigma_i\sigma_j\rangle_{\beta'}^2 = \mathbb{E}\Bigl\langle\tanh\Bigl(\beta'\sum_{k\in N(i)}g_{ik}\sigma_k\Bigr)\sigma_j\Bigr\rangle_{\beta'}^2 \le \beta'^2\,\mathbb{E}\Bigl(\sum_{k\in N(i)}|g_{ik}|\Bigr)^2 \le \beta'^2d\sum_{k\in N(i)}\mathbb{E}|g_{ik}|^2 \le \beta'^2d^2.$$

Thus,

$$\mathrm{Var}\,F(\beta) \ge \frac{|E|}{2}\sup_{0\le\beta'\le\min\{\beta,1/d\}}\beta'^2(1-\beta'^2d^2)^2.$$

Taking $\beta' = \min\{\beta, 1/2d\}$, we get

$$\mathrm{Var}\,F(\beta) \ge \frac{9|E|}{32}\min\Bigl\{\beta^2, \frac{1}{4d^2}\Bigr\}.$$

Finally, to prove the lower bound for $\beta = \infty$, just note that $F(\beta) \to F(\infty)$
almost surely, and the quantities are all bounded, so we can apply the
dominated convergence theorem to get convergence of the variance. This
completes the proof of Theorem 1.8.



3.10. Proof of Theorem 1.6. For $\beta < \infty$, this is just a combination
of Theorem 1.8 and Theorem 3.10. (Note that the notations of the two
theorems are related as $\mathbb{E}\langle\rho\rangle_{0,t} = |E|\,\mathbb{E}\langle Q_{1,2}\rangle$; also note that $\mathbb{E}\langle\rho\rangle_{0,0} \le |E|$
in this case, and therefore $v \ge Cq$.)

Next, note that as $\beta \to \infty$, the Gibbs measure at inverse temperature $\beta$
converges weakly to the uniform distribution on the set of ground states.
The same holds for the perturbed Gibbs measure. Thus,

$$\lim_{\beta\to\infty}\langle Q_{1,2}\rangle_\beta = \langle Q_{1,2}\rangle_\infty \quad\text{a.s.},$$

where $\langle Q_{1,2}\rangle_\beta$ denotes the Gibbs average at inverse temperature $\beta$. Since
all quantities are bounded by 1, we can take expectations on both sides and
apply dominated convergence.

3.11. Proof of Theorem 1.7. Let $g = (g_{ij})_{(i,j)\in E}$, and let $g', g''$ be
independent copies of $g$. For each $t \ge 0$, let

$$g^t := e^{-t}g + \sqrt{1-e^{-2t}}\,g', \qquad g^{-t} := e^{-t}g + \sqrt{1-e^{-2t}}\,g''.$$

For each $t \in \mathbb{R}$, let $\sigma^t$ denote a configuration drawn from the Gibbs measure
defined by the disorder $g^t$. For $t \ne s$, we assume that $\sigma^t$ and $\sigma^s$ are
independent given $g, g', g''$. Define



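$$\phi(t) := \frac{1}{|E|}\sum_{(i,j)\in E}\mathbb{E}\bigl((\sigma_i^t\sigma_j^t)(\sigma_i^{-t}\sigma_j^{-t})\bigr).$$

By Lemma 3.3 it follows that $\phi$ is a completely monotone function on $[0, \infty)$.
Also, $\phi$ is bounded by 1. Thus, for any $t > 0$,

(16) $|\phi'(t)| \le \frac{\phi(0) - \phi(t)}{t} \le \frac{1}{t}.$

Again, if we let

$$e^t_{ijkl} := \langle\sigma_i\sigma_j\sigma_k\sigma_l\rangle^t - \langle\sigma_i\sigma_j\rangle^t\langle\sigma_k\sigma_l\rangle^t,$$

where $\langle\cdot\rangle^t$ denotes the Gibbs average defined by the disorder $g^t$,
then by Lemma 3.2 we know that

$$\phi'(t) = -\frac{2e^{-2t}\beta^2}{|E|}\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}\bigl(e^t_{ijkl}\,e^{-t}_{ijkl}\bigr).$$

Now fix $t$, and let $\bar{e}_{ijkl} := \mathbb{E}(e^t_{ijkl} \mid g)$. Then $\mathbb{E}(e^t_{ijkl}e^{-t}_{ijkl}) = \mathbb{E}(\bar{e}_{ijkl}^2)$ and so
by (16), we have

(17) $\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}(\bar{e}_{ijkl}^2) \le \frac{|E|}{2te^{-2t}\beta^2}.$

Now let

$$u^t_{ijkl} := \langle\sigma_i\sigma_j\sigma_k\sigma_l\rangle^t, \qquad v^t_{ijkl} := \langle\sigma_i\sigma_j\rangle^t\langle\sigma_k\sigma_l\rangle^t,$$

and define $\bar{u}_{ijkl} := \mathbb{E}(u^t_{ijkl} \mid g)$ and $\bar{v}_{ijkl} := \mathbb{E}(v^t_{ijkl} \mid g)$. Then $|\bar{u}_{ijkl}|$ and
$|\bar{v}_{ijkl}|$ are both uniformly bounded by 1, and so

$$\Bigl|\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}(\bar{u}_{ijkl}^2) - \sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}(\bar{v}_{ijkl}^2)\Bigr| \le 2\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}|\bar{u}_{ijkl} - \bar{v}_{ijkl}|.$$

Since $\bar{u}_{ijkl} - \bar{v}_{ijkl} = \bar{e}_{ijkl}$, an application of the Cauchy-Schwarz inequality
and (17) to the above bound gives

$$\Bigl|\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}(\bar{u}_{ijkl}^2 - \bar{v}_{ijkl}^2)\Bigr| \le \frac{2|E|^{3/2}}{\beta e^{-t}\sqrt{2t}}.$$

To complete the proof, note that

$$\frac{1}{|E|^2}\sum_{(i,j)\in E,\,(k,l)\in E}\mathbb{E}\bigl(u^t_{ijkl}u^{-t}_{ijkl} - v^t_{ijkl}v^{-t}_{ijkl}\bigr) = \mathbb{E}\bigl\langle(Q_{\sigma^t,\sigma^{-t}} - \langle Q_{\sigma^t,\sigma^{-t}}\rangle)^2\bigr\rangle,$$

where $Q_{\sigma^t,\sigma^{-t}}$ is the bond overlap between $\sigma^t$ and $\sigma^{-t}$.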

3.12. Chaos under discrete perturbation. Our goal in this subsection is
to prove that superconcentration implies chaos under discrete perturbations.
Accordingly, let us first set the stage for discrete perturbation. Henceforth,
we deviate from the notation of Subsection 3.1.

The result of this subsection and its proof are inspired by Lemma 2.3
in [6]; we follow the same notation as in [6]. Let $X = (X_1, \ldots, X_n)$ be a
vector of independent random variables with $\mathrm{Var}(X_i) = 1$ for each $i$. Let $X'$
be an independent copy of $X$. For any $A \subseteq [n] := \{1, \ldots, n\}$, let $X^A$ be the
vector whose $i$th component is

$$X_i^A := \begin{cases}X_i' & \text{if } i \in A,\\ X_i & \text{if } i \notin A.\end{cases}$$

Let $f : \mathbb{R}^n \to \mathbb{R}$ be a twice differentiable function. Let $\partial_if$ and $\partial_i^2f$ be the
first and second partial derivatives of $f$ in the direction of the $i$th coordinate.

Theorem 3.14. Suppose $\epsilon$ and $\delta$ are constants such that for all $i$, $|\partial_if| \le \delta$
and $|\partial_i^2f| \le \epsilon$ everywhere in the closed convex hull of the support of $X$.
Fix $0 \le k \le n$, and let $A$ be a subset of $[n]$, chosen uniformly at random
from the collection of all subsets of size $k$. Define $X^A$ as above. Let $\gamma :=
\max_i\mathbb{E}|X_i - X_i'|^3$. Then

(18) E(f>/(X )(9i /(X^ < |±ivar( /( X)) + 

The proof of Theorem 3.14 is divided into a series of lemmas. First, let
us introduce some further conventions. To simplify notation, we will write
$f^A$ for $f(X^A)$. When $A = \emptyset$, we will simply write $f$. For any $i$ and $A$ such
that $i \notin A$, let

\[
\Delta_i f^A := f^A - f^{A \cup \{i\}}.
\]

As usual, when $A = \emptyset$, we will simply write $\Delta_i f$. Let $\mathcal{A}_{k,i}$ denote the
collection of all subsets of $[n] \setminus \{i\}$ of size $k$. For $0 \le k \le n-1$, define

\[
T_k := \sum_{i=1}^n \frac{1}{\binom{n-1}{k}} \sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^A).
\]

The above quantity is a discrete proxy for the left hand side in (18). Our
first result is an exact formula for the variance in terms of $T_0, \ldots, T_{n-1}$. This
is actually a restatement of Lemma 2.3 from [6].

Lemma 3.15. We have

\[
\mathrm{Var}(f) = \frac{1}{2n} \sum_{k=0}^{n-1} T_k.
\]
Proof. By exchangeability of $X_i$ and $X_i'$, it is easy to see that the pair
$(f, \Delta_i f^A)$ has the same distribution as the pair $(f^{\{i\}}, -\Delta_i f^A)$, and therefore

\[
\mathbb{E}(\Delta_i f \, \Delta_i f^A) = \mathbb{E}(f \, \Delta_i f^A) - \mathbb{E}(f^{\{i\}} \Delta_i f^A) = 2\,\mathbb{E}(f \, \Delta_i f^A). \tag{19}
\]

We claim that

\[
\frac{1}{n} \sum_{i=1}^n \sum_{k=0}^{n-1} \frac{1}{\binom{n-1}{k}} \sum_{A \in \mathcal{A}_{k,i}} \Delta_i f^A = f - f^{[n]}. \tag{20}
\]

To see this, consider any $B \subseteq [n]$ such that $B \ne \emptyset$ and $B \ne [n]$. Let $k = |B|$.
On the left hand side in the above display, if we write out the definition of
$\Delta_i f^A$ as $f^A - f^{A \cup \{i\}}$ and regroup terms, then the coefficient of $f^B$ in the
expansion is

\[
\frac{1}{n \binom{n-1}{k}} (n-k) - \frac{1}{n \binom{n-1}{k-1}}\, k = 0.
\]

Similarly, the coefficient of $f$ is $1$ and the coefficient of $f^{[n]}$ is $-1$. This
proves (20). Combining (20) with (19), we see that

\[
\mathrm{Var}(f) = \mathbb{E}\big(f (f - f^{[n]})\big) = \frac{1}{2n} \sum_{i=1}^n \sum_{k=0}^{n-1} \frac{1}{\binom{n-1}{k}} \sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^A).
\]

This completes the proof of the lemma. □ 
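The identity in Lemma 3.15 can be checked exactly on a toy example. The sketch below is only an illustration under assumptions that are not in the text: $n = 3$, the $X_i$ are uniform on $\{-1, 1\}$ (so that $\mathrm{Var}(X_i) = 1$), and `f` is an arbitrary hypothetical test function. It computes each $T_k$ by brute-force enumeration with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

N_COORDS = 3  # small n so that exact enumeration is cheap


def f(x):
    # Hypothetical test function on {-1, 1}^3 (not taken from the paper).
    return x[0] * x[1] + 2 * x[1] * x[2] + 3 * x[0] + x[0] * x[1] * x[2]


def swap(x, xp, A):
    """X^A: coordinates in A come from the independent copy xp, the rest from x."""
    return tuple(xp[j] if j in A else x[j] for j in range(len(x)))


def expect(g, n=N_COORDS):
    """Exact E g(X, X') for X, X' independent and uniform on {-1, 1}^n."""
    pts = list(product((-1, 1), repeat=n))
    return sum(Fraction(g(x, xp)) for x in pts for xp in pts) / len(pts) ** 2


def T(k, n=N_COORDS):
    """T_k as defined in the text, computed by brute force."""
    total = Fraction(0)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for A in combinations(others, k):
            A, Ai = set(A), set(A) | {i}

            def g(x, xp, A=A, Ai=Ai, i=i):
                delta_f = f(x) - f(swap(x, xp, {i}))               # Delta_i f
                delta_fA = f(swap(x, xp, A)) - f(swap(x, xp, Ai))  # Delta_i f^A
                return delta_f * delta_fA

            total += expect(g) / comb(n - 1, k)
    return total


mean = expect(lambda x, xp: f(x))
var = expect(lambda x, xp: f(x) ** 2) - mean ** 2
Ts = [T(k) for k in range(N_COORDS)]
print("Var(f) =", var, " T_k =", Ts, " sum/(2n) =", sum(Ts) / (2 * N_COORDS))
```

For this particular `f` the values of $T_k$ are also non-increasing in $k$, which is the content of the next lemma.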

Our next lemma is a monotonicity property of the $T_k$'s.
Lemma 3.16. We have $T_0 \ge T_1 \ge \cdots \ge T_{n-1} \ge 0$.

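Proof. Take any $A$ and $i \notin A$. It is easy to see that given $(X_j)_{j \notin A}$ and $X_i'$,
the random variables $\Delta_i f$ and $\Delta_i f^A$ are i.i.d. Therefore,

\[
\mathbb{E}(\Delta_i f \, \Delta_i f^A) = \mathbb{E}\big( (\mathbb{E}(\Delta_i f \mid (X_j)_{j \notin A}, X_i'))^2 \big).
\]

From this and Jensen's inequality, it is clear that $\mathbb{E}(\Delta_i f \, \Delta_i f^A) \ge 0$, and for
any $B$ with $A \subseteq B \subseteq [n] \setminus \{i\}$,

\[
\mathbb{E}(\Delta_i f \, \Delta_i f^A) \ge \mathbb{E}(\Delta_i f \, \Delta_i f^B).
\]

Thus, if $k := |A| \le n-2$, we have

\[
\mathbb{E}(\Delta_i f \, \Delta_i f^A) \ge \frac{1}{n-k-1} \sum_{B} \mathbb{E}(\Delta_i f \, \Delta_i f^B),
\]

where the sum is taken over all $B$ such that $B = A \cup \{j\}$ for some $j \notin A \cup \{i\}$.
Since any $B \in \mathcal{A}_{k+1,i}$ can be obtained by adding one element to $A$ for exactly
$k+1$ many $A \in \mathcal{A}_{k,i}$, we have

\[
\sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^A) \ge \frac{k+1}{n-k-1} \sum_{B \in \mathcal{A}_{k+1,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^B).
\]

This can be rewritten as

\[
\frac{1}{\binom{n-1}{k}} \sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^A) \ge \frac{1}{\binom{n-1}{k+1}} \sum_{B \in \mathcal{A}_{k+1,i}} \mathbb{E}(\Delta_i f \, \Delta_i f^B).
\]

Summing over $i$ gives $T_k \ge T_{k+1}$. This completes the proof of the lemma. □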
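The two counting facts used in this proof are easy to confirm by machine. The following sketch (the function names are hypothetical, chosen only for this illustration) checks that each set $B$ of size $k+1$ avoiding $i$ arises from exactly $k+1$ sets $A$ of size $k$, and that the factor $(k+1)/(n-k-1)$ converts the normalisation $1/\binom{n-1}{k}$ into $1/\binom{n-1}{k+1}$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def extension_counts(n, i, k):
    """For each B in A_{k+1,i}, count the A in A_{k,i} with B = A ∪ {j}, j not in A ∪ {i}."""
    others = [j for j in range(n) if j != i]
    counts = {}
    for A in combinations(others, k):
        for j in others:
            if j not in A:
                B = frozenset(A) | {j}
                counts[B] = counts.get(B, 0) + 1
    return counts


def normalisations_agree(n, k):
    """Check (k+1)/(n-k-1) * 1/C(n-1,k) == 1/C(n-1,k+1) exactly."""
    lhs = Fraction(k + 1, n - k - 1) * Fraction(1, comb(n - 1, k))
    return lhs == Fraction(1, comb(n - 1, k + 1))


counts = extension_counts(n=6, i=0, k=2)
print(len(counts), set(counts.values()))
print(all(normalisations_agree(n, k) for n in range(2, 12) for k in range(n - 1)))
```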

Combining Lemma 3.15 and Lemma 3.16, we easily get the following
discrete version of Theorem 3.14.

Lemma 3.17. For each $0 \le k \le n-1$,

\[
T_k \le \frac{2n\,\mathrm{Var}(f)}{k+1}.
\]

Proof. Since $T_0 \ge T_1 \ge \cdots \ge T_{n-1} \ge 0$, and

\[
\sum_{k=0}^{n-1} T_k = 2n\,\mathrm{Var}(f),
\]

it follows that for each $0 \le k \le n-1$,

\[
T_k \le \frac{1}{k+1} \sum_{r=0}^{k} T_r \le \frac{2n\,\mathrm{Var}(f)}{k+1}.
\]

This completes the proof of the lemma. □ 

Finally, we are ready to prove Theorem 3.14. This involves replacing
the discrete derivatives in Lemma 3.17 with continuous derivatives, and
incurring a small error along the way.

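Proof of Theorem 3.14. Since $|\partial_i f| \le \delta$ and $|\partial_i^2 f| \le \epsilon$ everywhere on the
closed convex hull of the support of $X$, by Taylor expansion we have

\[
|\Delta_i f^A| \le |X_i - X_i'|\,\delta, \qquad |\Delta_i f^A - (X_i - X_i')\,\partial_i f^A| \le \frac{\epsilon}{2} (X_i - X_i')^2.
\]

Thus,

\[
\begin{aligned}
\big| \mathbb{E}(\Delta_i f \, \Delta_i f^A) - \mathbb{E}\big((X_i - X_i')^2\, \partial_i f \, \partial_i f^A\big) \big|
&\le \big| \mathbb{E}\big( (\Delta_i f - (X_i - X_i')\,\partial_i f)\, \Delta_i f^A \big) \big| \\
&\quad + \big| \mathbb{E}\big( (X_i - X_i')\,\partial_i f\, (\Delta_i f^A - (X_i - X_i')\,\partial_i f^A) \big) \big| \\
&\le \delta\epsilon\, \mathbb{E}|X_i - X_i'|^3.
\end{aligned} \tag{21}
\]

Now let $X_i''$ be another independent copy of $X_i$, that is also independent
of $X_i'$. Let $\tilde{\partial}_i f$ denote $\partial_i f$ with $X_i$ replaced by $X_i''$, and define $\tilde{\partial}_i f^A$ similarly.
Since $\mathrm{Var}(X_i) = 1$ and $(X_i - X_i')^2$ is independent of $\tilde{\partial}_i f \, \tilde{\partial}_i f^A$, we have

\[
\mathbb{E}\big((X_i - X_i')^2\, \tilde{\partial}_i f \, \tilde{\partial}_i f^A\big) = 2\, \mathbb{E}(\tilde{\partial}_i f \, \tilde{\partial}_i f^A) = 2\, \mathbb{E}(\partial_i f \, \partial_i f^A).
\]

Again,

\[
|\partial_i f \, \partial_i f^A - \tilde{\partial}_i f \, \tilde{\partial}_i f^A| \le 2\delta\epsilon\, |X_i - X_i''|.
\]

Combining the last two observations, we get

\[
\begin{aligned}
\big| \mathbb{E}\big((X_i - X_i')^2\, \partial_i f \, \partial_i f^A\big) - 2\, \mathbb{E}(\partial_i f \, \partial_i f^A) \big|
&\le 2\delta\epsilon\, \mathbb{E}\big( (X_i - X_i')^2\, |X_i - X_i''| \big) \\
&\le 2\delta\epsilon\, \mathbb{E}|X_i - X_i'|^3.
\end{aligned}
\]

And now, combining the above bound with (21), we have

\[
2\, \mathbb{E}(\partial_i f \, \partial_i f^A) \le \mathbb{E}(\Delta_i f \, \Delta_i f^A) + 3\delta\epsilon\, \mathbb{E}|X_i - X_i'|^3. \tag{22}
\]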

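We also have to consider the case when $i \in A$. Let $B = A \setminus \{i\}$. Then by
Jensen's inequality we have

\[
\begin{aligned}
\mathbb{E}(\partial_i f \, \partial_i f^A) &= \mathbb{E}\big( (\mathbb{E}(\partial_i f \mid (X_j)_{j \notin A}))^2 \big) \\
&\le \mathbb{E}\big( (\mathbb{E}(\partial_i f \mid (X_j)_{j \notin B}))^2 \big) = \mathbb{E}(\partial_i f \, \partial_i f^B).
\end{aligned} \tag{23}
\]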

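Now take $1 \le k \le n-1$ and let $\mathcal{A}_k$ denote the set of all subsets of $[n]$ of
size $k$. Using (22) and (23), together with $\mathbb{E}|X_i - X_i'|^3 \le \gamma$, we get

\[
\begin{aligned}
\sum_{i=1}^n \sum_{A \in \mathcal{A}_k} \mathbb{E}(\partial_i f \, \partial_i f^A)
&= \sum_{i=1}^n \Big( \sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\partial_i f \, \partial_i f^A) + \sum_{A \in \mathcal{A}_k,\, A \ni i} \mathbb{E}(\partial_i f \, \partial_i f^A) \Big) \\
&\le \sum_{i=1}^n \sum_{A \in \mathcal{A}_{k,i}} \mathbb{E}(\partial_i f \, \partial_i f^A) + \sum_{i=1}^n \sum_{A \in \mathcal{A}_{k-1,i}} \mathbb{E}(\partial_i f \, \partial_i f^A) \\
&\le \frac{1}{2} \binom{n-1}{k} T_k + \frac{1}{2} \binom{n-1}{k-1} T_{k-1} + \frac{3n\delta\epsilon\gamma}{2} \binom{n}{k}.
\end{aligned}
\]

Dividing by $\binom{n}{k}$, the number of sets in $\mathcal{A}_k$, and using Lemma 3.17 together
with $\binom{n-1}{k} / \binom{n}{k} = (n-k)/n$ and $\binom{n-1}{k-1} / \binom{n}{k} = k/n$, we conclude that
for $1 \le k \le n-1$,

\[
\mathbb{E}\Big( \sum_{i=1}^n \partial_i f(X)\, \partial_i f(X^A) \Big) \le \frac{n-k}{2n}\, T_k + \frac{k}{2n}\, T_{k-1} + \frac{3n\delta\epsilon\gamma}{2} \le \frac{n+1}{k+1} \mathrm{Var}(f) + \frac{3n\delta\epsilon\gamma}{2}.
\]

The same conclusion can be drawn for $k = 0$ and $k = n$ by defining $T_{-1} =
T_n = 0$ and verifying that all steps hold. This completes the proof. □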
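The constant $(n+1)/(k+1)$ in (18) comes from a short piece of arithmetic, which can be confirmed exactly. The sketch below assumes the bookkeeping in which the sets $A$ of size $k$ are split according to whether they contain $i$, with weights $\binom{n-1}{k}/\binom{n}{k}$ and $\binom{n-1}{k-1}/\binom{n}{k}$, and the two bounds of Lemma 3.17 are then inserted; `variance_coefficient` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def variance_coefficient(n, k):
    """Coefficient of Var(f) after inserting T_k <= 2n Var(f)/(k+1) and
    T_{k-1} <= 2n Var(f)/k into (1/2) * [w_k * T_k + w_{k-1} * T_{k-1}]."""
    w_k = Fraction(comb(n - 1, k), comb(n, k))        # share of sets A with i not in A
    w_km1 = Fraction(comb(n - 1, k - 1), comb(n, k))  # share of sets A with i in A
    return Fraction(1, 2) * (w_k * Fraction(2 * n, k + 1) + w_km1 * Fraction(2 * n, k))


ok = all(
    variance_coefficient(n, k) == Fraction(n + 1, k + 1)
    for n in range(2, 30)
    for k in range(1, n)
)
print(ok)
```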

3.13. Proof of Theorem 1.3. Consider the S-K Hamiltonian $H_N$ defined
in (3) as a function of the disorder $g = (g_{ij})_{1 \le i,j \le N}$. Fix $\beta$, and define
$f = N^{-1/2} F_N(\beta)$, where $F_N(\beta)$ is the free energy defined earlier. Let $g'$ be
an independent copy of $g$, and define $g^A$ as we defined $X^A$ in Theorem 3.14.
Let $k = pN^2$ (and assume that $k$ is an integer), and define a perturbed
Hamiltonian using the disorder $g^A$, where $A$ is chosen uniformly at random
from the set of all subsets of $\{(i,j)\}_{1 \le i,j \le N}$ of size $k$.

Let $\sigma^1$ be sampled from the original Gibbs measure, and $\sigma^2$ from the
perturbed Gibbs measure. An easy verification shows that

\[
\mathbb{E} \sum_{i,j} \partial_{ij} f(g)\, \partial_{ij} f(g^A) = \mathbb{E}(R_{1,2}^2),
\]

where $\partial_{ij} f$ is the derivative of $f$ with respect to the $(i,j)$th coordinate. On
the other hand, by Theorem 1.5 we know that

\[
\mathrm{Var}\, f(g) \le \frac{C \log(2 + C\beta)}{\log N}.
\]

Finally, note that for any $i, j$,

\[
|\partial_{ij} f| \le \frac{1}{N}, \qquad |\partial_{ij}^2 f| \le \frac{\beta}{N^{3/2}}.
\]

Therefore, we can take $\delta = N^{-1}$ and $\epsilon = \beta N^{-3/2}$ in Theorem 3.14.
Using all the above information, we can now apply that theorem
(with $n = N^2$ and $k = pN^2$) to conclude that

\[
\mathbb{E}(R_{1,2}^2) \le \frac{C \log(2 + C\beta)}{p \log N} + C\beta N^{-1/2},
\]

where $C$ is an absolute constant. Since $p \in (0,1)$, we can ignore the second
term on the right after replacing $\log(2 + C\beta)$ by $C(1 + \beta)$ in the first term.
This completes the proof.
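To see why the error term is of order $\beta N^{-1/2}$, one can evaluate $3n\delta\epsilon\gamma/2$ with $n = N^2$, $\delta = N^{-1}$ and $\epsilon = \beta N^{-3/2}$. The sketch below assumes, as is standard for the S-K model, that the couplings are standard Gaussian, so that $g_{ij} - g_{ij}' \sim N(0, 2)$ and $\gamma = \mathbb{E}|g_{ij} - g_{ij}'|^3 = 8/\sqrt{\pi}$; the function names are hypothetical.

```python
import math

# gamma = E|g - g'|^3 for independent standard Gaussians g, g' (so g - g' ~ N(0, 2)).
GAMMA = 8.0 / math.sqrt(math.pi)


def gamma_numeric(upper=40.0, steps=200_000):
    """Midpoint-rule approximation of E|Z|^3 for Z ~ N(0, 2), as a sanity check."""
    h = upper / steps
    total = 0.0
    for m in range(steps):
        z = (m + 0.5) * h
        total += z ** 3 * math.exp(-z * z / 4.0)
    # Factor 2 for symmetry; 1/sqrt(4*pi) is the N(0, 2) density normalisation.
    return 2.0 * total * h / math.sqrt(4.0 * math.pi)


def error_term(N, beta):
    """3 n delta eps gamma / 2 with n = N^2, delta = 1/N, eps = beta / N^(3/2)."""
    n, delta, eps = N ** 2, 1.0 / N, beta / N ** 1.5
    return 1.5 * n * delta * eps * GAMMA


print(GAMMA, gamma_numeric())
print(error_term(10**4, 1.0))  # equals 1.5 * beta * GAMMA / sqrt(N)
```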

3.14. Sharpness of Theorem 3.1 for the REM. The Random Energy
Model (REM), introduced by Derrida [9, 10], is possibly the simplest model
of a spin glass. The state space is $\{-1, 1\}^N$ as usual, but here the energies of
states $\{-H_N(\sigma)\}_{\sigma \in \{-1,1\}^N}$ are chosen to be i.i.d. Gaussian random variables
with mean zero and variance $N$. We show that Theorem 3.1 gives a sharp
result in the low temperature regime ($\beta > 2\sqrt{\log 2}$) of this model. We follow
the notation of Theorem 3.1.

Proposition 3.18. Suppose $\sigma^1$ is drawn from the original Gibbs measure
of the REM and $\sigma^2$ from the Gibbs measure perturbed continuously up to
time $t$, in the sense of Subsection 1.2. If $\beta > 2\sqrt{\log 2}$, there are positive
constants $C(\beta)$ and $c(\beta)$ depending only on $\beta$ such that for all $N$ and $t$,

\[
c(\beta)\, e^{-C(\beta) N \min\{1,t\}} \le \mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \le C(\beta)\, e^{-c(\beta) N \min\{1,t\}}.
\]

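Proof. In the notation of Theorem 3.1, we have $\rho_{\sigma\sigma'} = 0$ if $\sigma \ne \sigma'$, and
$\rho_{\sigma\sigma'} = N$ if $\sigma = \sigma'$. Also, clearly, $\nu_\sigma = 2^{-N}$ for each $\sigma$. Suppose $\sigma^1$
is drawn from the original Gibbs measure and $\sigma^2$ from the Gibbs measure
perturbed continuously up to time $t$. Taking $\phi(x) = x/N$ in Theorem 3.1,
we get

\[
\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \le \inf_{s \ge t} \big( 2^{-N} e^{2\beta^2 e^{-s} N} \big)^{t/s}.
\]

Now choose $s$ so large that $2\beta^2 e^{-s} \le \frac{1}{2} \log 2$. The above inequality shows
that for $t \le s$,

\[
\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \le 2^{-Nt/2s}, \tag{24}
\]

and for $t > s$,

\[
\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \le 2^{-N/2}. \tag{25}
\]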
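The passage from the infimum bound to (24) and (25) is a one-line computation, sketched numerically below. This assumes the bound has the form $\inf_{s \ge t} (2^{-N} e^{2\beta^2 e^{-s} N})^{t/s}$; the helper names `log_bound` and `s_star` and the sample values of $N$ and $\beta$ are hypothetical choices for illustration. Logarithms are used to avoid underflow.

```python
import math


def log_bound(N, beta, t, s):
    """log of (2^{-N} * exp(2 beta^2 e^{-s} N))^{t/s}, for s >= t > 0."""
    return (t / s) * N * (-math.log(2.0) + 2.0 * beta ** 2 * math.exp(-s))


def s_star(beta):
    """Smallest s with 2 beta^2 e^{-s} <= (1/2) log 2."""
    return math.log(4.0 * beta ** 2 / math.log(2.0))


N, beta = 50, 2.0          # beta = 2 exceeds 2*sqrt(log 2), roughly 1.665
s0 = s_star(beta)

# For t <= s0, plugging s = s0 into the bound gives exactly 2^{-N t / (2 s0)}.
for t in (0.1, 1.0, s0):
    print(t, log_bound(N, beta, t, s0), -N * t * math.log(2.0) / (2.0 * s0))

# For t > s0, plugging s = t gives at most 2^{-N/2}.
for t in (s0 + 0.5, 10.0):
    print(t, log_bound(N, beta, t, t), -N * math.log(2.0) / 2.0)
```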

A simple computation via Theorem 3.8 now gives

\[
\mathrm{Var}(F_N(\beta)) \le C(\beta),
\]

where $C(\beta)$ is a constant depending only on $\beta$.
Now suppose $\beta > 2\sqrt{\log 2}$. Let $H_N'(\sigma) = H_N(\sigma) + N a_N$, where $a_N$ solves

Na% = log I 



2 N 



Let $(w_\alpha^N)_{1 \le \alpha \le 2^N}$ denote the numbers $\exp(-\beta H_N'(\sigma))$ when enumerated in
non-increasing order. It follows from arguments in Section 1.2 of Talagrand [34]
that this point process converges in distribution, as $N \to \infty$,
to a Poisson point process $(w_\alpha)_{\alpha \ge 1}$ with intensity $x^{-m-1}$ on $[0, \infty)$, where
$m = 2\sqrt{\log 2}/\beta$. It is not difficult to extend this argument to show that

\[
\lim_{N \to \infty} \mathrm{Var}\Big( \log \sum_{\alpha=1}^{2^N} w_\alpha^N \Big) = \mathrm{Var}\Big( \log \sum_{\alpha=1}^{\infty} w_\alpha \Big) > 0.
\]

We skip the details, which are somewhat tedious. (Here $\beta > 2\sqrt{\log 2}$ is
required to ensure that the infinite sum $\sum_{\alpha=1}^{\infty} w_\alpha$ converges almost surely.)

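However, $\mathrm{Var}\big( \log \sum_{\alpha} w_\alpha^N \big) = \mathrm{Var}(\beta F_N(\beta))$. Thus, there is a positive
constant $c(\beta)$ depending only on $\beta$ such that for any $N$,

\[
\mathrm{Var}(F_N(\beta)) \ge c(\beta).
\]

We can now use Theorem 3.10 to prove that for some positive constant $c(\beta)$
depending only on $\beta$, we have that for any $N$ and $t$,

\[
\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \ge c(\beta)\, e^{-Nt/c(\beta)}. \tag{26}
\]

However, we also have by Theorem 3.1 that $\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t}$ is a decreasing
function of $t$, and hence

\[
\mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,t} \ge \mathbb{E}\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle_{0,\infty} = 2^{-N}.
\]

Combined with (24), (25) and (26), this completes the proof. □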